Abstract

We  study  the  inverse  problem  of  estimating  a  field  ua  from  data  comprising  a 
finite  set  of  nonlinear  functionals  of  ua ,  subject  to  additive  noise;  we  denote 
this  observed  data  by  y.  Our  interest  is  in  the  reconstruction  of  piecewise 
continuous  fields  ua  in  which  the  discontinuity  set  is  described  by  a  finite 
number  of  geometric  parameters  a.  Natural  applications  include  groundwater 
flow  and  electrical  impedance  tomography.  We  take  a  Bayesian  approach, 
placing  a  prior  distribution  on  ua  and  determining  the  conditional  distribution 
on ua given the data y. It is then natural to study maximum a posteriori (MAP)
estimators.  Recently  (Dashti  et  al  2013  Inverse  Problems  29  095017)  it  has 
been  shown  that  MAP  estimators  can  be  characterised  as  minimisers  of  a 
generalised  Onsager-Machlup  functional,  in  the  case  where  the  prior  measure 
is  a  Gaussian  random  field.  We  extend  this  theory  to  a  more  general  class  of 
prior  distributions  which  allows  for  piecewise  continuous  fields.  Specifically, 
the  prior  field  is  assumed  to  be  piecewise  Gaussian  with  random  interfaces 
between  the  different  Gaussians  defined  by  a  finite  number  of  parameters.  We 
also make connections with recent work on MAP estimators for linear problems
and possibly non-Gaussian priors (Helin and Burger 2015 Inverse
Problems  31  085009)  which  employs  the  notion  of  Fomin  derivative.  In 
showing  applicability  of  our  theory  we  focus  on  the  groundwater  flow  and  EIT 
models,  though  the  theory  holds  more  generally.  Numerical  experiments  are 
implemented  for  the  groundwater  flow  model,  demonstrating  the  feasibility  of 
determining  MAP  estimators  for  these  piecewise  continuous  models,  but  also 
that the geometric formulation can lead to multiple nearby (local) MAP estimators.
We relate these MAP estimators to the behaviour of output from
MCMC samples of the posterior, obtained using a state-of-the-art function
space  Metropolis-Hastings  method. 

Keywords:  inverse  problems,  Bayesian  approach,  geometric  priors,  MAP 
estimators,  EIT,  groundwater  flow 

(Some  figures  may  appear  in  colour  only  in  the  online  journal) 

1.  Introduction 

1.1.  Context  and  literature  review 

A common inverse problem is that of estimating an unknown function from noisy measurements
of a (possibly nonlinear) map applied to the function. Statistical and deterministic
approaches to this problem have been considered extensively. In this paper we focus on the
study of MAP estimators within the Bayesian approach; these estimators provide a natural
link  between  deterministic  and  statistical  methods.  In  the  Bayesian  formulation,  we  describe 
the  solution  probabilistically  and  the  distribution  of  the  unknown,  given  the  measurements 
and  a  prior  model,  is  termed  the  posterior  distribution.  MAP  estimators  attempt  to  work  with  a 
notion  of  solutions  of  maximal  probability  under  this  posterior  distribution  and  are  typically 
characterised  variationally,  linking  to  deterministic  methods. 

There  are  two  main  approaches  taken  to  the  study  of  the  posterior.  The  first  is  to 
discretise  the  space,  and  then  apply  finite  dimensional  Bayesian  methodology  [18].  An 
advantage  to  this  approach  is  the  availability  of  a  Lebesgue  density  and  a  large  amount  of 
previous work which can then be built upon; but issues may arise (for example computationally)
when the dimension of the discretisation space is increased. An alternative approach
is to apply infinite dimensional methodology directly on the original space, to derive algorithms,
and then discretise to implement. This approach has been studied for linear problems
in  [12,  25,  27],  and  more  recently  for  nonlinear  problems  [10,  21,  22,  33].  It  is  the  latter 
approach  that  we  focus  on  in  this  paper. 

In some situations it may be that point estimates are more desirable, or more computationally
feasible, than the entire posterior distribution. A detailed study of point estimates can
be found in, for example, [24]. Three different estimates are commonly considered: the posterior
mean which minimises $L^2$ loss, the posterior median which minimises $L^1$ loss, and
posterior  modes  which  minimise  zero-one  loss.  The  former  two  estimates  are  unique  [28],  but 
a  distribution  may  possess  more  than  one  mode.  A  consequence  of  this  is  that  the  posterior 
mean  and  median  may  be  misleading  in  the  case  of  a  multi-modal  posterior.  Posterior  modes 
are  often  termed  maximum  a  posteriori  (MAP)  estimators  in  the  literature. 

In  this  paper  we  focus  on  MAP  estimation.  If  the  posterior  has  Lebesgue  density  p,  MAP 
estimators  are  given  by  the  global  maxima  of  p.  The  problem  of  MAP  estimation  in  this  case 
is hence a deterministic variational problem, and has been well-studied [18]. In the infinite-dimensional
setting there is no Lebesgue density, but there has been recent research aimed at
characterising  the  mode  variationally  and  linking  to  the  classical  regularisation  techniques 
described  in,  for  example,  [9]  in  the  case  when  Gaussian  priors  are  adopted.  Non-Gaussian 
priors  have  also  been  considered  in  the  infinite  dimensional  setting — in  [14]  weak  MAP 
(wMAP)  estimators  are  defined  as  generalisations  of  MAP  estimators,  and  a  variational 
characterisation  of  them  is  provided  in  the  case  that  the  forward  map  is  linear,  using  the 
notion of Fomin derivative.

Figure 1. An example of construction of a piecewise continuous field, using two
continuous fields and two scalar parameters. Here the scalar parameters determine the
points where the interface meets each side of the domain. We work on the space of
continuous fields and parameters, but it is the pushforward of these by the construction
map that represents the piecewise continuous field we aim to recover.

In  this  paper  we  make  a  significant  extension  of  the  work  in  [9]  to  include  priors  which 
are  defined  by  a  combination  of  Gaussian  random  fields  and  a  finite  number  of  geometric 
parameters  which  define  the  different  domains  in  which  the  different  random  fields  apply.  We 
thereby  study  the  reconstruction  of  piecewise  continuous  fields  with  interfaces  defined  by  a 
finite  number  of  parameters.  Our  motivation  for  doing  so  comes  from  the  work  in  [5],  and  its 
predecessors. In that paper a Bayesian inverse problem for piecewise constant fields, modelling
the permeability appearing in a two-phase subsurface flow model, was studied. Such
piecewise  continuous  fields  were  also  previously  studied  in  a  groundwater  flow  context  in 
[16],  where  existence  and  well-posedness  of  the  posterior  distribution  were  shown.  The  idea 
of  single  point  estimates  being  misleading  is  discussed  and  the  existence  of  multiple  local 
MAP  estimators  is  shown.  We  also  link  our  work  to  that  in  [14],  by  characterising  the  MAP 
estimator  via  the  Fomin  derivative. 

Throughout  this  paper  we  focus  on  two  model  problems:  groundwater  flow  and  electrical 
impedance  tomography  (EIT).  Both  of  these  problems  are  important  examples  of  large  scale 
inverse  problems,  with  applications  of  great  economic  and  societal  value.  MAP  estimation  in 
such  problems  has  been  studied  previously  [2,  4,  17,  31].  However  our  formulation  is  quite 
general;  for  brevity  we  simply  illustrate  the  theory  for  groundwater  flow  and  EIT,  and  the 
numerics  only  in  the  case  of  groundwater  flow. 

1.2.  Mathematical  setting 

Let $X$ be a separable Banach space and let $A \subset \mathbb{R}^k$. $X$ should be thought of as a function space
and $A$ a space of geometric parameters. Given $(u, a) \in X \times A$, we construct another function
$u^a \in Z$, say. Considering the ingredients $u$ and $a$ in the construction of this function $u^a$
separately will be useful in what follows. An example of such a construction is shown in
figure 1.

Suppose we have a (typically nonlinear) forward operator $\mathcal{G} : X \times A \to Y$, where
$Y = \mathbb{R}^J$. If $(u, a)$ denotes the true input to our forward problem, we observe data $y \in Y$ given
by

$$y = \mathcal{G}(u, a) + \eta,$$

where $\eta \sim N(0, \Gamma)$, with $\Gamma \in \mathbb{R}^{J \times J}$ positive definite, is some centred Gaussian noise on $Y$.
Modelling everything probabilistically, we build up the joint distribution of $(u, a, y)$ by
specifying a prior distribution $\mu_0 \times \nu_0$ on $(u, a)$ and an independent noise model on $\eta$. We are
then interested in the posterior $\mu$ on $(u, a)$ given $y$. Denote by $|\cdot|$ the Euclidean norm on $\mathbb{R}^J$, and
for any positive definite $A \in \mathbb{R}^{J \times J}$ denote by $|\cdot|_A := |A^{-1/2}\,\cdot\,|$ the weighted norm on $\mathbb{R}^J$. Under
certain conditions, using a form of Bayes' theorem, we may write $\mu$ in the form

$$\mu(\mathrm{d}u, \mathrm{d}a) \propto \exp\left(-\frac{1}{2}|\mathcal{G}(u, a) - y|_\Gamma^2\right)\mu_0(\mathrm{d}u)\,\nu_0(\mathrm{d}a).$$

Figure 2. Possible sets $A_i$ corresponding to example 2.1.

Figure 3. Possible sets $A_i$ corresponding to example 2.2.

Figure 4. Possible sets $A_i$ corresponding to example 2.3 in the case $p = 1/2$.

The modes of the posterior distribution, termed MAP estimators, can be considered ‘best
guesses’ for the state $(u, a)$ given the data $y$. We now state rigorously what we mean by a
MAP estimator for $\mu$, as in [9]. Given $(u, a) \in X \times A$, denote by $B^\delta(u, a)$ the ball of radius
$\delta$ centred at $(u, a)$.

Definition 1.1 (MAP estimator). For each $\delta > 0$, define

$$(u^\delta, a^\delta) := \operatorname*{argmax}_{(u, a) \in X \times A} \mu(B^\delta(u, a)).$$

Any point $(\bar{u}, \bar{a}) \in X \times A$ satisfying

$$\lim_{\delta \downarrow 0} \frac{\mu(B^\delta(\bar{u}, \bar{a}))}{\mu(B^\delta(u^\delta, a^\delta))} = 1$$

is called a MAP estimator for the measure $\mu$.

Figure 5. Possible sets $A_i$ corresponding to example 2.4 in the case $K = 11$, $N = 6$.

If this definition is applied to probability measures defined via a Lebesgue density, MAP
estimators coincide with maxima of this density. Here we extend the notion to the study of
piecewise continuous fields.
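
As a finite dimensional illustration of definition 1.1 (a sketch added here, not taken from the paper), consider a measure on the real line with a Lebesgue density. The mixture weights, means and standard deviations below are hypothetical choices made only for this example. The code finds, over a grid of candidate centres, the centre maximising the ball probability for decreasing radii, and compares it with the grid maximiser of the density.

```python
import math
import numpy as np

# Hypothetical two-component Gaussian mixture on the real line.
WEIGHTS = (0.7, 0.3)
MEANS = (-1.0, 1.5)
STDS = (0.3, 0.5)

def cdf(x):
    """Cumulative distribution function of the mixture."""
    return sum(w * 0.5 * (1.0 + math.erf((x - m) / (s * math.sqrt(2.0))))
               for w, m, s in zip(WEIGHTS, MEANS, STDS))

def density(x):
    """Lebesgue density of the mixture."""
    return sum(w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
               for w, m, s in zip(WEIGHTS, MEANS, STDS))

def ball_probability(x, delta):
    """Probability of the ball (interval) of radius delta centred at x."""
    return cdf(x + delta) - cdf(x - delta)

def argmax_ball(delta, centres):
    """Grid approximation of the centre maximising the ball probability."""
    probs = [ball_probability(x, delta) for x in centres]
    return centres[int(np.argmax(probs))]

centres = np.linspace(-3.0, 4.0, 7001)  # candidate centres, spacing 1e-3
mode = centres[int(np.argmax([density(x) for x in centres]))]

for delta in (1.0, 0.3, 0.1, 0.01):
    print(delta, argmax_ball(delta, centres), mode)
```

For large radii the maximising centre need not sit at the mode, since a large ball can trade some mass near the mode for mass from the second component; as the radius shrinks the maximiser approaches the maximiser of the density.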

1.3.  Our  contribution 

The  primary  contributions  of  the  paper  are  fourfold: 

(i)  We  develop  the  MAP  estimator  theory  for  infinite  dimensional  geometric  inverse 
problems  involving  discontinuous  fields,  building  on  theory  in  both  of  the  recent  papers 
[9,  14],  and  opening  up  new  avenues  for  the  study  of  MAP  estimators  in  infinite 
dimensional  inverse  problems. 

(ii)  We  explicitly  link  MAP  estimation  for  these  geometric  inverse  problems  to  a  variational 
Onsager-Machlup  minimisation  problem. 

(iii) We show that the theory applies to the groundwater flow model as in [16] and to the
EIT problem as in [11].

(iv)  We  implement  numerical  experiments  for  the  groundwater  flow  model  and  demonstrate 
the  feasibility  of  computing  (local)  MAP  estimators  within  the  geometric  formulation, 
but  also  show  that  they  can  lead  to  multiple  nearby  solutions.  We  relate  these  multiple 
MAP  estimators  to  the  behaviour  of  output  from  MCMC  to  probe  the  posterior. 

1.4.  Structure  of  the  paper 

•  In  section  2  we  describe  the  forward  maps  associated  with  the  groundwater  flow  and  EIT 
problems,  and  show  that  they  have  the  appropriate  regularity  needed  in  sections  4-5. 

•  In  section  3  we  describe  the  choice  of,  and  assumptions  upon,  the  prior  distribution  whose 
samples  comprise  piecewise  Gaussian  random  fields  with  random  interfaces. 

•  In  section  4  we  show  existence  and  uniqueness  of  the  posterior  distribution. 

•  In  section  5  we  define  MAP  estimators  and  prove  their  equivalence  to  minimisers  of  an 
appropriate  Onsager-Machlup  functional. 

•  In  section  6  we  present  numerics  for  the  groundwater  flow  problem.  We  consider  three 
different  prior  models  and  investigate  maximisers  of  the  posterior  distribution. 

•  In  section  7  we  conclude  and  outline  possible  future  work  in  the  area. 



Inverse  Problems  32  (2016)  105003 


M  M  Dunlop  and  A  M  Stuart 


Figure 6. An example domain $D$, with attached electrodes $(e_l)_{l=1}^L$, for the EIT problem.


2.  The  forward  problem 

We consider two model problems. Our first problem (groundwater flow) is that of determining
the piecewise continuous permeability of a medium, given noisy measurements of
water  pressure  (or  hydraulic  head)  within  it.  The  second  problem  (EIT)  is  determination  of 
the  piecewise  continuous  conductivity  within  a  body  from  boundary  voltage  measurements. 

In  what  follows,  the  finite  dimensional  space  A  will  be  a  space  of  geometric  parameters 
defining  the  interfaces  between  different  media,  and  X  will  be  a  product  of  function  spaces 
defining the values of the permeabilities/conductivities between the interfaces.

We begin in section 2.1 by defining the construction map $(u, a) \mapsto u^a$ for the piecewise
continuous fields. In sections 2.2 and 2.3 we describe the models for groundwater flow and
EIT respectively, and prove regularity properties of the resulting forward maps; these properties
are required for our subsequent theory.

2.1. Defining the interfaces

Let $D \subset \mathbb{R}^d$ be the domain of interest and let $A \subset \mathbb{R}^k$ be the space of geometric parameters.
Take a collection of set-valued maps $A_i : A \to \mathcal{B}(D)$, $i = 1, \ldots, N$, such that for each $a \in A$
we have

$$\bigcup_{i=1}^N A_i(a) = D, \qquad A_i(a) \cap A_j(a) = \emptyset \ \text{ if } i \neq j.$$

We assume that each map $A_i$ is continuous in the sense that

$$|a - b| \to 0 \ \Rightarrow\ |A_i(a) \,\Delta\, A_i(b)| \to 0,$$

where $|\cdot|$ applied to a subset of $D$ denotes its Lebesgue measure and $\Delta$ denotes the symmetric difference:

$$A \,\Delta\, B := (A \setminus B) \cup (B \setminus A).$$

Let $X = C^0(\bar{D}; \mathbb{R}^N)$. Given $u = (u_1, \ldots, u_N) \in X$ and $a \in A$ we define the function
$u^a \in L^\infty(D)$ by

$$u^a = F(u, a) := \sum_{i=1}^N u_i \mathbb{1}_{A_i(a)}, \qquad (2.1)$$

where $F : X \times A \to L^\infty(D)$ is the construction map.

We give four examples of the maps $A_i$ and the sets/interfaces they define.

Example 2.1. Let $D = [0, 1]^2$, $A = [0, 1]^2$ and $N = 2$. We specify points $a$ and $b$ on
either side of the square $D$ and join them with a straight line. We then let $A_1(a, b)$ be the
region of $D$ below this line and $A_2(a, b) = D \setminus A_1(a, b)$. Example sets $A_i(a, b)$ for various
parameters $a$, $b$ are shown in figure 2.
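
The construction map (2.1) for example 2.1 can be sketched on a grid as follows. This is an illustration only: the two continuous fields `u1` and `u2`, the grid resolution and the parameter values are hypothetical, and the interface is taken to be the straight line from $(0, a)$ to $(1, b)$, with points strictly below the line assigned to $A_1$.

```python
import numpy as np

def indicator_below_line(a, b, xx, yy):
    """Indicator of A_1(a, b): grid points strictly below the line from (0, a) to (1, b)."""
    return yy < a + (b - a) * xx

def u1(x, y):
    """Hypothetical continuous field used on A_1."""
    return 1.0 + 0.1 * np.sin(2.0 * np.pi * x)

def u2(x, y):
    """Hypothetical continuous field used on A_2."""
    return -1.0 + 0.1 * np.cos(2.0 * np.pi * y)

def construct_field(a, b, xx, yy):
    """Evaluate u^a = u_1 1_{A_1(a,b)} + u_2 1_{A_2(a,b)} on the grid."""
    below = indicator_below_line(a, b, xx, yy)
    return np.where(below, u1(xx, yy), u2(xx, yy))

n = 201
x = np.linspace(0.0, 1.0, n)
xx, yy = np.meshgrid(x, x, indexing="ij")  # xx[i, j] = x[i], yy[i, j] = x[j]

a, b = 0.3, 0.6
field = construct_field(a, b, xx, yy)

# Fraction of grid points in A_1; the exact area under the line is (a + b) / 2.
area_A1 = indicator_below_line(a, b, xx, yy).mean()
print(area_A1, 0.5 * (a + b))
```

The field `field` is discontinuous across the line even though `u1` and `u2` are smooth, which is the structure the prior in this paper is designed to capture.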

Example 2.2. Let $D = [0, 1]^2$, $A = [0, 1]^2$ and $N = 2$. Choose a continuous map
$H : A \to L^\infty([0, 1])$ such that $H(a, b)(0) = a$ and $H(a, b)(1) = b$ for all $(a, b) \in A$. Let
$A_1(a, b)$ be the region of $D$ beneath the graph of the curve $H(a, b)$ and let
$A_2(a, b) = D \setminus A_1(a, b)$. This setup includes the previous example:
$H(a, b)(x) = a + (b - a)x$ defines the appropriate straight lines.

The continuity of $A_1$ and $A_2$ can be seen by noting that

$$\begin{aligned}
|A_1(a_1, b_1) \,\Delta\, A_1(a_2, b_2)| &= |A_2(a_1, b_1) \,\Delta\, A_2(a_2, b_2)| \\
&\leq \int_0^1 |H(a_1, b_1)(x) - H(a_2, b_2)(x)| \,\mathrm{d}x \\
&\leq \|H(a_1, b_1) - H(a_2, b_2)\|_\infty
\end{aligned}$$

and using the continuity of $H$ into $L^\infty([0, 1])$.

For example, one may take $H$ to be given by

$$H(a, b)(x) = a + (b - a)x + x\sin(6\pi x)/10,$$

which can be seen to be continuous into $L^\infty([0, 1])$. Example sets $A_i(a, b)$ for various
parameters $a$, $b$, with this choice of $H$, are shown in figure 3.
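
The continuity estimate of example 2.2 can be checked numerically. The sketch below assumes the perturbed interface reads $H(a, b)(x) = a + (b - a)x + x\sin(6\pi x)/10$, uses a midpoint rule for the area of the symmetric difference, and uses hypothetical parameter values; it is an illustration, not part of the paper's argument.

```python
import numpy as np

def H(a, b, x):
    """Interface height, assuming the form a + (b - a) x + x sin(6 pi x) / 10."""
    return a + (b - a) * x + x * np.sin(6.0 * np.pi * x) / 10.0

def symmetric_difference_area(params1, params2, n=4000):
    """Midpoint-rule approximation of |A_1(params1) symmetric-difference A_1(params2)|.

    Heights are clipped to [0, 1] because the regions are subsets of D = [0, 1]^2.
    """
    x = (np.arange(n) + 0.5) / n
    h1 = np.clip(H(*params1, x), 0.0, 1.0)
    h2 = np.clip(H(*params2, x), 0.0, 1.0)
    return float(np.mean(np.abs(h1 - h2)))

def sup_norm_distance(params1, params2, n=4000):
    """Grid approximation of the sup-norm distance between the two interfaces."""
    x = np.linspace(0.0, 1.0, n + 1)
    return float(np.max(np.abs(H(*params1, x) - H(*params2, x))))

p1, p2 = (0.3, 0.6), (0.35, 0.5)  # hypothetical parameter pairs (a, b)
area = symmetric_difference_area(p1, p2)
sup = sup_norm_distance(p1, p2)
print(area, sup)
```

Because the perturbation term does not depend on $(a, b)$, the difference of two interfaces is linear in $x$, so the sup-norm distance is attained at an endpoint of $[0, 1]$.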


Example 2.3. We can generalise the previous example to allow the inclusion of a fault. Let
$D = [0, 1]^2$, $A = [0, 1]^2 \times [-1, 1]$ and $N = 2$. Let $p \in (0, 1)$ denote the horizontal
location of the fault. Given $H : [0, 1]^2 \to L^\infty([0, 1])$ as in the previous example, define
$\hat{H} : A \to L^\infty([0, 1])$ by

$$\hat{H}(a, b, c)(x) = \begin{cases} H(a, b)(x) & x \in [0, p] \\ c + H(a, b)(x) & x \in (p, 1] \end{cases}$$

so that the parameter $c$ determines the (signed) magnitude of the fault. Defining the sets
$A_1(a, b, c)$ and $A_2(a, b, c)$ as the regions of $D$ beneath and above the curve $\hat{H}(a, b, c)$
respectively, the continuity can be seen in a similar manner to the previous example. Example
sets $A_i(a, b, c)$ for various parameters $a$, $b$, $c$ are shown in figure 4.


Example 2.4. Again working with $D = [0, 1]^2$, but with a much larger parameter space,
one could also select points at specific $x$-coordinates and linearly interpolate between them.
Fix $K, N \in \mathbb{N}$ and set $A = (\Sigma^{N-1})^K \subset [0, 1]^{(N-1) \times K}$, where $\Sigma^{N-1}$ is the simplex

$$\Sigma^{N-1} = \{(y_1, \ldots, y_{N-1}) \in [0, 1]^{N-1} \mid 0 \leq y_1 \leq \ldots \leq y_{N-1} \leq 1\}.$$

Then given $a \in A$, define the functions $f_i(a)$, $i = 1, \ldots, N - 1$, to be the linear interpolation of
the points $\left(\frac{j-1}{K-1}, a_{ij}\right)_{j=1}^K$. $A_i(a)$, $i = 1, \ldots, N - 1$, is then defined to be the region between the
graphs of the functions $f_i(a)$ and $f_{i-1}(a)$, and $A_N(a) = D \setminus \bigcup_{i=1}^{N-1} A_i(a)$. Example sets $A_i(a)$
for various parameters $a$ are shown in figure 5.

In order to see the continuity of these maps, we first partition the domain into strips $D_j$,

$$D_j = \left\{(x, y) \in D \;\middle|\; \frac{j-1}{K-1} \leq x \leq \frac{j}{K-1}\right\}, \qquad j = 1, \ldots, K - 1,$$

so that we have

$$A_i(a) = \bigcup_{j=1}^{K-1} A_i(a) \cap D_j.$$

It follows from properties of the symmetric difference that

$$|A_i(a) \,\Delta\, A_i(b)| \leq \sum_{j=1}^{K-1} |(A_i(a) \cap D_j) \,\Delta\, (A_i(b) \cap D_j)|.$$

It hence suffices to show that the maps $A_i(\cdot) \cap D_j$ are continuous for all $i$, $j$. This follows
from the same argument as in example 2.2, for sufficiently small $|a - b|$.


2.2.  The  Darcy  model  for  groundwater  flow 

We consider the Darcy model for groundwater flow on a domain $D \subset \mathbb{R}^d$, $d = 1, 2, 3$. Let
$\kappa = (\kappa_{ij})$ denote the permeability tensor of the medium, $p$ the pressure of the water, and
assume the viscosity of the water is constant. Darcy’s law [8] tells us that the velocity is
proportional to the gradient of the pressure:

$$v = -\kappa \nabla p.$$

Additionally, a local form of mass conservation tells us that

$$\nabla \cdot v = f.$$

Combining these two equations, and imposing Dirichlet boundary conditions for simplicity,
results in the PDE

$$\begin{cases} -\nabla \cdot (\kappa \nabla p) = f & \text{in } D \\ p = g & \text{on } \partial D. \end{cases}$$

This is the PDE we will consider in the forward model, and it gives rise to a solution
map $\kappa \mapsto p$.
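
To make the solution map $\kappa \mapsto p$ concrete, the following is a minimal one-dimensional sketch on $D = (0, 1)$ with a piecewise constant scalar permeability. The cell-centred finite difference scheme, the grid size and the permeability values are assumptions made for illustration; this is not the discretisation used in the paper's numerical experiments.

```python
import numpy as np

def solve_darcy_1d(kappa_cells, f_nodes, g_left, g_right):
    """Solve -(kappa p')' = f on (0, 1) with p(0) = g_left, p(1) = g_right.

    kappa_cells holds one permeability value per grid cell (n cells),
    f_nodes holds the source at the n + 1 grid nodes.
    """
    n = kappa_cells.size
    h = 1.0 / n
    # Row for interior node i couples cells i - 1 (left) and i (right).
    main = (kappa_cells[:-1] + kappa_cells[1:]) / h**2
    off = -kappa_cells[1:-1] / h**2
    matrix = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    rhs = np.asarray(f_nodes[1:-1], dtype=float).copy()
    # Move the known Dirichlet values to the right-hand side.
    rhs[0] += kappa_cells[0] * g_left / h**2
    rhs[-1] += kappa_cells[-1] * g_right / h**2
    p = np.empty(n + 1)
    p[0], p[-1] = g_left, g_right
    p[1:-1] = np.linalg.solve(matrix, rhs)
    return p

n = 100
# Piecewise constant permeability with an interface at x = 1/2 (hypothetical values).
kappa_cells = np.where(np.arange(n) < n // 2, 1.0, 4.0)
p = solve_darcy_1d(kappa_cells, np.zeros(n + 1), g_left=1.0, g_right=0.0)
print(p[n // 2])  # pressure at the interface
```

With zero source the flux $-\kappa p'$ is constant, so the pressure is piecewise linear and drops faster in the low permeability region; for the values above the interface pressure works out to $0.2$.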

For simplicity we will work in the case where $\kappa$ is an isotropic (scalar) permeability,
bounded above and below by positive constants, and so it can be represented as the image of
some bounded function under a positive continuously differentiable map $\sigma : \mathbb{R} \to \mathbb{R}^+$.

Let $V = H^1(D)$, the Sobolev space of once weakly differentiable functions on $D$ [13].
Then given $f \in H^{-1}(D)$, $g \in H^{1/2}(\partial D)$, $u \in X$ and $a \in A$, define $p_{u,a} \in V$ to be the solution
of the weak form of the PDE

$$\begin{cases} -\nabla \cdot (\sigma(u^a) \nabla p_{u,a}) = f & \text{in } D \\ p_{u,a} = g & \text{on } \partial D. \end{cases} \qquad (2.2)$$


We are first interested in the regularity of the map $\mathcal{R} : X \times A \to V$ given by $\mathcal{R}(u, a) = p_{u,a}$.
To this end we recall what it means for $p_{u,a}$ to be a solution of (2.2). Since $g \in H^{1/2}(\partial D)$, by the
trace theorem [13] there exists $G \in V$ such that $\mathrm{tr}(G) = g$. The solution $p_{u,a}$ of (2.2) is then
given by $p_{u,a} = q_{u,a} + G$, where $q_{u,a} \in H_0^1(D)$ solves the PDE

$$\begin{cases} -\nabla \cdot (\sigma(u^a) \nabla q_{u,a}) = f + \nabla \cdot (\sigma(u^a) \nabla G) & \text{in } D \\ q_{u,a} = 0 & \text{on } \partial D. \end{cases} \qquad (2.3)$$
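
The right-hand side of (2.3) can be checked by substituting $p_{u,a} = q_{u,a} + G$ into (2.2). The computation below is a sketch added for the reader: the divergence of $\sigma(u^a)\nabla G$ is understood in the weak sense, and $\varphi$ denotes an arbitrary test function in $H_0^1(D)$, a symbol introduced here for illustration.

$$-\nabla \cdot \big(\sigma(u^a) \nabla (q_{u,a} + G)\big) = f \quad\Longleftrightarrow\quad -\nabla \cdot (\sigma(u^a) \nabla q_{u,a}) = f + \nabla \cdot (\sigma(u^a) \nabla G).$$

In weak form this reads

$$\int_D \sigma(u^a) \nabla q_{u,a} \cdot \nabla \varphi \,\mathrm{d}x = \langle f, \varphi \rangle - \int_D \sigma(u^a) \nabla G \cdot \nabla \varphi \,\mathrm{d}x \qquad \text{for all } \varphi \in H_0^1(D),$$

and the homogeneous boundary condition follows from $\mathrm{tr}(q_{u,a}) = \mathrm{tr}(p_{u,a}) - \mathrm{tr}(G) = g - g = 0$.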

The following lemma tells us that the map $\mathcal{R}$ is well defined and has certain regularity
properties. Its proof is given in the appendix.

Lemma 2.5. The map $\mathcal{R} : X \times A \to V$ is well-defined and satisfies:

(i) for each $(u, a) \in X \times A$,

$$\|\mathcal{R}(u, a)\|_V \leq \left(\|f\|_{V^*} + \|\sigma(u^a)\|_{L^\infty} \|G\|_V\right) / \kappa_{\min}(u, a) + \|G\|_V,$$

where $\kappa_{\min}(u, a)$ is given by

$$\kappa_{\min}(u, a) = \operatorname*{ess\,inf}_{x \in D} \sigma(u^a(x)) > 0;$$

(ii) for each $a \in A$, $\mathcal{R}(\cdot, a) : X \to V$ is locally Lipschitz continuous, i.e. for every $r > 0$
there exists $L(r) > 0$ such that, for all $u, v \in X$ with $\|u\|_X, \|v\|_X < r$ and all $a \in A$, we
have

$$\|\mathcal{R}(u, a) - \mathcal{R}(v, a)\|_V \leq L(r) \|u - v\|_X;$$

(iii) for each $u \in X$, $\mathcal{R}(u, \cdot) : A \to V$ is continuous.

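We now choose a continuous linear observation operator $\ell : V \to \mathbb{R}^J$. For example,
writing $\ell = (\ell_1, \ldots, \ell_J)$ we could take

$$\ell_j(p) = \int_D \frac{1}{(2\pi\varepsilon)^{d/2}} e^{-|x_j - y|^2 / 2\varepsilon} p(y) \,\mathrm{d}y, \qquad (2.4)$$

for some $\varepsilon > 0$, so that $\ell_j$ approximates a point observation at the point $x_j \in D$. Our forward
operator $\mathcal{G} : X \times A \to \mathbb{R}^J$ is then defined by $\mathcal{G} = \ell \circ \mathcal{R}$, so that it can be written as the
composition

$$(u, a) \mapsto u^a \mapsto \kappa = \sigma(u^a) \mapsto p \mapsto \ell(p).$$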
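
The mollified point observation (2.4) can be sketched in one dimension ($d = 1$, $D = (0, 1)$) with a trapezoidal quadrature rule. The grid, the value of $\varepsilon$ and the test functions below are hypothetical choices for illustration.

```python
import numpy as np

def mollified_point_observation(p_values, y, x_j, eps):
    """Approximate l_j(p) = int_D (2 pi eps)^(-d/2) exp(-|x_j - y|^2 / (2 eps)) p(y) dy for d = 1.

    p_values are samples of p on the uniform grid y; the integral is
    approximated by the composite trapezoidal rule.
    """
    d = 1
    kernel = np.exp(-(x_j - y) ** 2 / (2.0 * eps)) / (2.0 * np.pi * eps) ** (d / 2)
    integrand = kernel * p_values
    h = y[1] - y[0]
    return h * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

y = np.linspace(0.0, 1.0, 2001)
eps = 1e-4   # kernel variance; the kernel has standard deviation 0.01
x_j = 0.5    # observation point in the interior of D

obs_constant = mollified_point_observation(np.ones_like(y), y, x_j, eps)
obs_linear = mollified_point_observation(y, y, x_j, eps)
print(obs_constant, obs_linear)
```

For small $\varepsilon$ and an observation point well inside the domain, the functional returns approximately the value of a continuous $p$ at $x_j$, while remaining bounded on $V$, which a genuine point evaluation is not when $d \geq 2$.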

From the above regularity of $\mathcal{R}$ we can deduce the following regularity properties of our
forward operator $\mathcal{G}$:

Proposition 2.6. Define the map $\mathcal{G} : X \times A \to \mathbb{R}^J$ as above. Then $\mathcal{G}$ satisfies:

(i) For each $r > 0$ there exists $C(r) > 0$ such that, for all $u, v \in X$ with $\|u\|_X, \|v\|_X < r$ and
all $a \in A$,

$$|\mathcal{G}(u, a) - \mathcal{G}(v, a)| \leq C(r) \|u - v\|_X.$$

(ii) For each $u \in X$, the map $\mathcal{G}(u, \cdot) : A \to \mathbb{R}^J$ is continuous.

Proof.

(i) The map $\ell$ is defined to be a continuous linear operator, and so in particular is Lipschitz.
Since we have $\mathcal{G} = \ell \circ \mathcal{R}$ the result follows from lemma 2.5(ii).

(ii) This follows from the continuity of $\ell$ and lemma 2.5(iii). □
2.3. The complete electrode model (CEM) for EIT

EIT  is  an  imaging  technique  that  aims  to  make  inference  about  the  internal  conductivity  of  a 
body  from  surface  voltage  measurements.  Electrodes  are  attached  to  the  surface  of  the  body, 
current  is  injected,  and  the  resulting  voltages  on  the  electrodes  are  measured.  Applications 
include both medical imaging, where the aim is to non-invasively detect internal abnormalities
within a human patient, and subsurface imaging, where material properties of the subsurface
are differentiated via their conductivities. Early references include [15] in the context
of  medical  imaging  and  [20]  in  the  context  of  subsurface  imaging. 

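The CEM is proposed for the forward model in [32], and shown to agree with experimental
data up to measurement precision. In its strong form, the PDE reads

$$\begin{cases}
-\nabla \cdot (\kappa(x) \nabla v(x)) = 0 & x \in D \\
\int_{e_l} \kappa \frac{\partial v}{\partial n} \,\mathrm{d}S = I_l & l = 1, \ldots, L \\
\kappa(x) \frac{\partial v}{\partial n}(x) = 0 & x \in \partial D \setminus \bigcup_{l=1}^L e_l \\
v(x) + z_l \kappa(x) \frac{\partial v}{\partial n}(x) = V_l & x \in e_l, \ l = 1, \ldots, L.
\end{cases} \qquad (2.5)$$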

The domain $D$ represents the body, and $(e_l)_{l=1}^L \subset \partial D$ the electrodes attached to its
surface with corresponding contact impedances $(z_l)_{l=1}^L$. Figure 6 shows an example domain
and attached electrodes. A current $I_l$ is injected into each electrode $e_l$ and a voltage measurement
$V_l$ made. Here $\kappa$ represents the conductivity of the body, and $v$ the potential within
it. Note that the solution comprises both a function $v \in H^1(D)$ and a vector $(V_l)_{l=1}^L \in \mathbb{R}^L$ of
boundary voltage measurements.

A corresponding weak form exists, and is shown to have a unique solution (up to
constants) given appropriate conditions on $\kappa$, $(z_l)_{l=1}^L$ and $(I_l)_{l=1}^L$; see [32] for details.
Moreover, under some additional assumptions, the mapping $\kappa \mapsto (V_l)_{l=1}^L$ is known to be
Fréchet differentiable when we equip the conductivity space with the supremum norm [17].

We can apply different current stimulation patterns to the electrodes to yield additional
information. Assume that we have $M$ different (linearly independent) current stimulation
patterns $(I^{(m)})_{m=1}^M$. This yields $M$ different mappings $\kappa \mapsto (V_l^{(m)})_{l=1}^L$, each with the regularity
above, or equivalently a mapping $\kappa \mapsto V$ where $V \in \mathbb{R}^J$ with $J = LM$.

Analogously to the Darcy model case, we will consider isotropic conductivities of the
form $\kappa = \sigma(u^a)$, where $\sigma : \mathbb{R} \to \mathbb{R}^+$ is positive and continuously differentiable. Our forward
operator $\mathcal{G} : X \times A \to \mathbb{R}^J$ is then given by the composition

$$(u, a) \mapsto u^a \mapsto \kappa = \sigma(u^a) \mapsto ((v^{(1)}, V^{(1)}), \ldots, (v^{(M)}, V^{(M)})) \mapsto (V^{(1)}, \ldots, V^{(M)}).$$

We show in the appendix that the map defined in this way has the same regularity as the map
corresponding to the Darcy model.

Proposition 2.7. Define the map $\mathcal{G} : X \times A \to \mathbb{R}^J$ as above. Then $\mathcal{G}$ satisfies:

(i) For each $r > 0$ there exists $C(r) > 0$ such that, for all $u, v \in X$ with $\|u\|_X, \|v\|_X < r$ and
all $a \in A$,

$$|\mathcal{G}(u, a) - \mathcal{G}(v, a)| \leq C(r) \|u - v\|_X.$$

(ii) For each $u \in X$, the map $\mathcal{G}(u, \cdot) : A \to \mathbb{R}^J$ is continuous.
3. Onsager-Machlup functionals and prior modelling

In this section we recall the definition of an Onsager-Machlup functional for a measure which
is equivalent (see footnote 2) to a Gaussian measure. We then introduce the prior measures that we will
consider,  first  on  the  function  space  X ,  then  the  geometric  parameter  space  A,  and  finally  the 
product  space  X  x  A.  We  conclude  the  section  by  extending  the  definition  of  Onsager- 
Machlup  functional  so  that  it  is  appropriate  for  the  measures  we  consider  here,  supported  on 
fields  and  geometric  parameters  which  are  combined  to  make  piecewise  continuous  functions. 

3.1.  Onsager-Machlup  functionals 

The  Onsager-Machlup  functional  of  a  measure  is  the  negative  logarithm  of  its  Lebesgue 
density  when  such  a  density  exists,  and  otherwise  can  be  thought  of  analogously.  We  start  by 
defining  it  precisely  for  measures  defined  via  density  with  respect  to  a  Gaussian,  allowing  for 
infinite dimensional spaces on which Lebesgue measure is not defined. Suppose that $\mu$ is a
measure equivalent to a Gaussian measure $\mu_0$. Then the Onsager-Machlup functional for $\mu$ is
defined as follows.


Definition 3.1 (Onsager-Machlup functional I). Let $\mu$ be a measure on a Banach space $Z$
which is equivalent to $\mu_0$, where $\mu_0$ is a Gaussian measure on $Z$ with Cameron-Martin space
$E$. Let $B^\delta(z)$ denote the ball of radius $\delta$ centred at $z \in Z$. A functional $I : Z \to \bar{\mathbb{R}}$ is called the
Onsager-Machlup functional for $\mu$ if, for each $x, y \in E$,

$$\lim_{\delta \downarrow 0} \frac{\mu(B^\delta(x))}{\mu(B^\delta(y))} = \exp(I(y) - I(x)),$$

and $I(x) = \infty$ for $x \notin E$.
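
The limit in definition 3.1 can be observed numerically in the simplest setting. The sketch below takes $\mu = N(0, 1)$ on the real line, for which $I(x) = x^2/2$ up to an additive constant; the two points and the radii are hypothetical choices, and the example is an illustration rather than part of the paper.

```python
import math

def gaussian_ball_probability(x, delta):
    """mu(B^delta(x)) for mu = N(0, 1) on the real line, via the error function."""
    def cdf(t):
        return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return cdf(x + delta) - cdf(x - delta)

def onsager_machlup(x):
    """Onsager-Machlup functional of N(0, 1), i.e. minus the log density up to a constant."""
    return 0.5 * x * x

def ball_ratio(x, y, delta):
    """Ratio of small ball probabilities mu(B^delta(x)) / mu(B^delta(y))."""
    return gaussian_ball_probability(x, delta) / gaussian_ball_probability(y, delta)

x, y = 0.5, 1.5
limit = math.exp(onsager_machlup(y) - onsager_machlup(x))
for delta in (1.0, 0.1, 0.01, 0.001):
    print(delta, ball_ratio(x, y, delta), limit)
```

The additive constant in the functional cancels in the difference $I(y) - I(x)$, which is why the functional is only determined up to a constant.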


Remarks 3.2.

(i) The Onsager-Machlup functional is only defined up to addition of a constant.

(ii) If $Z$ is finite dimensional and $\mu$ admits a positive Lebesgue density $\rho$, then
$I(x) = -\log \rho(x)$ for all $x \in Z$. In light of the previous remark, this is true even if $\rho$
is not normalised.

(iii) Let $Z = \mathbb{R}^n$ be finite dimensional, and let $\mu_0 = N(0, \Sigma)$ be a Gaussian measure on $Z$. Let
$\Gamma \in \mathbb{R}^{m \times m}$ be a positive-definite matrix, $A \in \mathbb{R}^{m \times n}$ and $y \in \mathbb{R}^m$. Define $\mu$ by

$$\frac{\mathrm{d}\mu}{\mathrm{d}\mu_0}(x) \propto \exp\left(-\frac{1}{2}|Ax - y|_\Gamma^2\right)$$

so that

$$\frac{\mathrm{d}\mu}{\mathrm{d}x}(x) \propto \exp\left(-\frac{1}{2}|Ax - y|_\Gamma^2 - \frac{1}{2}|x|_\Sigma^2\right).$$

Then by the previous remark, the Onsager-Machlup functional for $\mu$ is given by

$$I(x) = \frac{1}{2}|Ax - y|_\Gamma^2 + \frac{1}{2}|x|_\Sigma^2$$

for all $x \in Z$, which is a Tikhonov regularised least squares functional.


Footnote 2. Two measures $\nu$, $\mu$ on a measurable space $(M, \mathcal{M})$ are equivalent if $\nu(A) = 0$ if and only if $\mu(A) = 0$,
for $A \in \mathcal{M}$.
(iv) The preceding example (iii) may be extended to an infinite dimensional setting. Let $Z$ be
a separable Banach space, and let $\mu_0 = N(0, \mathcal{C}_0)$ be a Gaussian measure on $Z$ with
Cameron-Martin space $(E, \langle\cdot,\cdot\rangle_E, \|\cdot\|_E)$. Let $\Gamma \in \mathbb{R}^{m \times m}$ be a positive-definite matrix,
$A : Z \to \mathbb{R}^m$ a bounded linear operator and $y \in \mathbb{R}^m$. Define $\mu$ by

$$\frac{\mathrm{d}\mu}{\mathrm{d}\mu_0}(x) \propto \exp\left(-\frac{1}{2}|Ax - y|_\Gamma^2\right).$$

Then theorem 3.2 in [9] tells us that the Onsager-Machlup functional for $\mu$ is given by

$$I(x) = \frac{1}{2}|Ax - y|_\Gamma^2 + \frac{1}{2}\|x\|_E^2$$

for $x \in E$.

(v) In this paper, the posterior distribution will be a measure on the product space
Z  =  X  x  A.  The  prior  distribution  will  be  an  independent  product  of  a  Gaussian  on  X 
and  a  compactly  supported  measure  on  A.  Due  to  the  assumption  of  compact  support,  the 
prior  will  not  be  equivalent  to  a  Gaussian  measure  on  Z  and  so  the  above  definition  does 
not  apply;  we  provide  a  suitable  extension  to  the  definition  in  section  3.4. 
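
The finite dimensional Tikhonov functional of remark 3.2(iii) can be minimised in closed form, which gives a concrete MAP estimator. In the sketch below the matrices, data and dimensions are randomly generated or hypothetical, and the weighted norm is implemented as $|z|_\Gamma^2 = z^\top \Gamma^{-1} z$, following the definition $|\cdot|_\Gamma = |\Gamma^{-1/2}\,\cdot\,|$.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3                          # data and parameter dimensions (hypothetical)
A = rng.standard_normal((m, n))      # linear forward map
y = rng.standard_normal(m)           # data
Gamma = 0.1 * np.eye(m)              # noise covariance
Sigma = np.diag([1.0, 2.0, 0.5])     # prior covariance

Gamma_inv = np.linalg.inv(Gamma)
Sigma_inv = np.linalg.inv(Sigma)

def onsager_machlup(x):
    """I(x) = 0.5 |Ax - y|_Gamma^2 + 0.5 |x|_Sigma^2."""
    residual = A @ x - y
    return 0.5 * residual @ Gamma_inv @ residual + 0.5 * x @ Sigma_inv @ x

def gradient(x):
    """Gradient of I, which vanishes at the minimiser."""
    return A.T @ Gamma_inv @ (A @ x - y) + Sigma_inv @ x

# I is a strictly convex quadratic, so its unique minimiser solves the normal equations.
hessian = A.T @ Gamma_inv @ A + Sigma_inv
x_map = np.linalg.solve(hessian, A.T @ Gamma_inv @ y)

print(x_map, onsager_machlup(x_map))
```

In this linear Gaussian setting the minimiser coincides with the posterior mean; for the nonlinear forward maps and geometric priors considered in this paper no such closed form is available, and several local minimisers may exist.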

As  we  are  taking  a  Bayesian  approach  to  the  inverse  problem,  we  incorporate  our  prior 
beliefs about the permeability/conductivity into the model via probability measures on $X$ and
$A$. We will combine these into a prior measure on the product space $X \times A$. We equip this
space with any (complete) norm $\|(\cdot, \cdot)\|$ such that if $\|(u, a)\| \to 0$, then $\|u\|_X \to 0$
and $|a| \to 0$.

3.2.  Priors  for  the  fields 

We wish to put priors on the fields $u_1, \ldots, u_N \in C^0(\bar{D})$. We use independent Gaussian measures
$u_i \sim \mu_0^i := N(m_i, \mathcal{C}_i)$, where the means $m_i \in C^0(\bar{D})$, and each covariance operator
$\mathcal{C}_i : C^0(\bar{D}) \to C^0(\bar{D})$ is trace-class and positive definite. It follows that the vector
$(u_1, \ldots, u_N) \sim \mu_0^1 \times \ldots \times \mu_0^N =: \mu_0$ is Gaussian on $X$:

$$\mu_0 = N\left(m, \bigoplus_{i=1}^N \mathcal{C}_i\right),$$

where $m = (m_1, \ldots, m_N) \in X$. If $E_i$ denotes the Cameron-Martin space [10] of $\mu_0^i$, then that of
$\mu_0$ is given by

$$E = \bigoplus_{i=1}^N E_i$$

with inner product given by the sum of those of its component spaces.

The Onsager-Machlup functional of $\mu_0$ is known to be given by

$$u \mapsto \begin{cases} \frac{1}{2}\|u - m\|_E^2 & u - m \in E \\ \infty & u - m \notin E. \end{cases}$$

This can be seen, for example, as a consequence of proposition 18.3 in [26].

Remark 3.3. We may assume that the different fields are correlated under the prior, so long
as $\mu_0$ remains Gaussian on $X$; this does not affect any of the following theory. Allowing
correlations between the fields and the geometric parameters under the prior is a more
technical issue however, and so we will assume that these are independent.

Example 3.4. Define the negative Laplacian with Neumann boundary conditions as follows:

\[ A = -\Delta, \qquad \mathcal{D}(A) = \Big\{ u \in H^2(D) \;\Big|\; \frac{\partial u}{\partial \nu} = 0 \text{ on } \partial D, \ \int_D u(x)\,\mathrm{d}x = 0 \Big\}. \]

Then $A$ is invertible. We can define $\mathcal{C}_i = A^{-\alpha_i}$, where each $\alpha_i > d/2$. Then each $\mathcal{C}_i$ is trace-class and positive definite, and samples from each $\mu_0^i$ will be almost surely continuous, and so $\mu_0$ can be considered as a Gaussian measure on $X$. Moreover, regularity of the samples will increase as $\alpha_i$ increases; see [10] for details.
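A draw from $N(0, A^{-\alpha})$ can be approximated by a truncated Karhunen-Loève expansion. The sketch below is illustrative only and assumes $D = [0,1]^2$, for which the Neumann eigenfunctions of $A$ are products of cosines with eigenvalues $\pi^2(k_1^2 + k_2^2)$. The function name, the cell-centred grid and the truncation level `kmax` are hypothetical choices and are not taken from the paper.

```python
import numpy as np

def sample_neumann_gaussian(alpha, n=32, kmax=16, rng=None):
    """Approximate draw from N(0, A^{-alpha}) on [0,1]^2 (assumed domain).

    Uses a truncated Karhunen-Loeve sum over the cosine eigenbasis of the
    Neumann Laplacian; kmax should not exceed n.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = (np.arange(n) + 0.5) / n                       # cell-centred grid points
    k = np.arange(kmax)
    # 1D eigenfunctions, L2-normalised: 1 for k = 0, sqrt(2) cos(pi k x) otherwise
    scale = np.where(k[:, None] == 0, 1.0, np.sqrt(2.0))
    phi = scale * np.cos(np.pi * k[:, None] * x[None, :])
    lam = np.pi**2 * (k[:, None]**2 + k[None, :]**2)   # eigenvalues of A
    lam[0, 0] = np.inf                                  # drop constants (zero-mean constraint)
    coeff = rng.standard_normal((kmax, kmax)) * lam**(-alpha / 2.0)
    return phi.T @ coeff @ phi                          # field values, shape (n, n)
```

Increasing `alpha` damps the high-frequency coefficients more strongly, which mirrors the statement that sample regularity increases with $\alpha_i$.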


3.3.  Priors  for  the  geometric  parameters 


We also want to put a prior measure on the geometric parameters, i.e. we want to choose a probability measure on $A$. Since $A \subseteq \mathbb{R}^k$, the analysis is more straightforward than in the infinite dimensional case. Let $\nu$ be a probability measure on $A$ with compact support $S \subseteq A$. We assume $\nu$ is absolutely continuous with respect to the Lebesgue measure and that its density $\rho$ is continuous on $S$. Despite being defined on a finite dimensional space, the measure $\nu$ is not necessarily equivalent to the Lebesgue measure on the whole of $\mathbb{R}^k$, and so the previous definition of Onsager-Machlup functional does not apply. We hence must formulate a new definition for this case.

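Since $\rho > 0$ on $\mathrm{int}(S)$, we can use the continuity of $\rho$ to calculate the limits of ratios of small ball probabilities for $\nu$ on $\mathrm{int}(S)$. Let $a_1, a_2 \in \mathrm{int}(S)$, then

\[
\lim_{\delta \downarrow 0} \frac{\nu(B_\delta(a_1))}{\nu(B_\delta(a_2))}
= \lim_{\delta \downarrow 0} \frac{\int_{B_\delta(a_1)} \rho(a)\,\mathrm{d}a}{\int_{B_\delta(a_2)} \rho(a)\,\mathrm{d}a}
= \lim_{\delta \downarrow 0} \frac{\frac{1}{|B_\delta(a_1)|}\int_{B_\delta(a_1)} \rho(a)\,\mathrm{d}a}{\frac{1}{|B_\delta(a_2)|}\int_{B_\delta(a_2)} \rho(a)\,\mathrm{d}a}
= \frac{\rho(a_1)}{\rho(a_2)}
= \exp\big(\log\rho(a_1) - \log\rho(a_2)\big).
\]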


If either $a_1$ or $a_2$ lies outside of $S$, the limit can be seen to be $0$ or $\infty$ respectively. It hence makes sense to define the Onsager-Machlup functional for $\nu$ on $A \setminus \partial S$ as

\[ K(a) = \begin{cases} -\log\rho(a) & a \in \mathrm{int}(S), \\ \infty & a \notin S. \end{cases} \]

For $a \in \partial S$, we define $K(a)$ to be the limit of $K$ from the interior:

\[ K(a) = -\lim_{\substack{b \to a \\ b \in \mathrm{int}(S)}} \log\rho(b), \qquad a \in \partial S, \]

which is well defined due to the continuity of $\rho$ on $\mathrm{int}(S)$. $K$ is then continuous on the whole of $S$.
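The definition of $K$ can be illustrated in one dimension. The following is a minimal sketch, assuming $S = [\text{lower}, \text{upper}] \subset \mathbb{R}$ and a density that is continuous on $S$, so that the boundary value of the density coincides with its limit from the interior. The function name and arguments are hypothetical.

```python
import math

def onsager_machlup_K(a, density, lower, upper):
    """K(a) for a prior on R with continuous density on S = [lower, upper].

    Returns -log(density(a)) on S (boundary points included, by continuity)
    and +infinity outside S.
    """
    if a < lower or a > upper:
        return math.inf
    rho = density(a)
    return -math.log(rho) if rho > 0 else math.inf

# Density of the uniform prior U([0, 1]) on its support (illustrative choice)
uniform_density = lambda a: 1.0
```

For the uniform prior, `onsager_machlup_K` vanishes on all of $[0, 1]$, including the endpoints, and is infinite elsewhere, so no jump is introduced at the boundary.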
Remark 3.5. If we were to define $K$ on $\partial S$ in the same way that we defined it on $A \setminus \partial S$, $K$ would have a positive jump at the boundary related to the geometry of $S$. This would mean that $K$ was not lower semi-continuous on $S$, which would cause problems when seeking minimisers. The definition we have chosen is appropriate: if any minimising sequence $(a_n)_{n \geq 1} \subseteq \mathrm{int}(S)$ of $K$ has an accumulation point on $\partial S$, then $\nu$ has a mode at that point.

If  we  have  no  prior  knowledge  about  the  interfaces  and  A  is  compact,  we  could  place  a 
uniform  prior  on  the  whole  of  A.  Otherwise  we  could  either  choose  a  prior  with  smaller 
support,  or  one  that  weights  certain  areas  more  than  others. 


3.4.  Priors  on  X  x  A 

We assume that the priors on the fields and the geometric parameters are independent, so that we may take the product measure $\mu_0 \times \nu_0$ as our prior on $X \times A$. Note that if $F : X \times A \to L^\infty(D)$ denotes the construction map $(u, a) \mapsto u^a$ defined earlier by (2.1), then our prior permeability/conductivity distribution on $L^\infty(D)$ is given by the pushforward$^3$ $F^{\#}(\mu_0 \times \nu_0)$. This is much more cumbersome to deal with, however, since for example $L^\infty(D)$ is not separable. It is for this reason that we incorporate the mapping $F$ into the forward map $\mathcal{G}$. Assuming now that the prior $\mu_0 \times \nu_0$ is as described above, we can define the Onsager-Machlup functional for measures $\mu$ on $X \times A$ which are equivalent to $\mu_0 \times \nu_0$.


Definition 3.6 (Onsager-Machlup functional II). Let $\mu$ be a measure on $X \times A$ equivalent to $\mu_0 \times \nu_0$, where $\mu_0$ and $\nu_0$ satisfy the assumptions detailed above. Let $B_\delta(u, a)$ denote the ball of radius $\delta$ centred at $(u, a) \in X \times A$. A functional $I : X \times A \to \mathbb{R}$ is called the Onsager-Machlup functional for $\mu$ if,

(i) for each $(u, a), (v, b) \in E \times \mathrm{int}(S)$,

\[ \lim_{\delta \downarrow 0} \frac{\mu(B_\delta(u, a))}{\mu(B_\delta(v, b))} = \exp\big(I(v, b) - I(u, a)\big); \]

(ii) for each $(u, a) \in E \times \partial S$,

\[ I(u, a) = \lim_{\substack{b \to a \\ b \in \mathrm{int}(S)}} I(u, b); \]

(iii) $I(u, a) = \infty$ for $u \notin E$ or $a \notin S$.


4.  Likelihood  and  posterior  distribution 

We return to the abstract setting mentioned in the introduction. Let $X$ be a separable Banach space, $A \subseteq \mathbb{R}^k$ and $Y = \mathbb{R}^J$. Suppose we have a forward operator $\mathcal{G} : X \times A \to Y$. If $(u, a)$ denotes the true input to our forward problem, we observe data $y \in Y$ given by

\[ y = \mathcal{G}(u, a) + \eta, \]

where $\eta \sim \mathbb{Q}_0 := N(0, \Gamma)$, $\Gamma \in \mathbb{R}^{J \times J}$ positive definite, is Gaussian noise on $Y$ independent of the prior.

$^3$ Given a measurable map $F : (X, \mathcal{X}) \to (Y, \mathcal{Y})$ between two measurable spaces, the pushforward of a measure $\mu$ on $X$ is the measure $F^{\#}\mu$ on $Y$ defined by $(F^{\#}\mu)(A) = \mu(F^{-1}(A))$ for $A \in \mathcal{Y}$. If a random variable $u$ on $X$ has law $\mu$, then the random variable $F(u)$ on $Y$ has law $F^{\#}\mu$.

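It is clear that we have $y \,|\, (u, a) \sim \mathbb{Q}_{u,a} := N(\mathcal{G}(u, a), \Gamma)$. We can use this to formally find the distribution of $(u, a) \,|\, y$. First note that

\[ \mathbb{Q}_{u,a}(\mathrm{d}y) = \exp\Big(-\Phi(u, a; y) + \frac{1}{2}|y|_\Gamma^2\Big)\,\mathbb{Q}_0(\mathrm{d}y), \]

where the potential (or negative log-likelihood) $\Phi : X \times A \times Y \to \mathbb{R}$ is given by

\[ \Phi(u, a; y) = \frac{1}{2}|\mathcal{G}(u, a) - y|_\Gamma^2. \qquad (4.1) \]

Hence under suitable regularity conditions, Bayes' theorem tells us that the distribution $\mu$ of $(u, a) \,|\, y$ satisfies

\[ \mu(\mathrm{d}u, \mathrm{d}a) \propto \exp\big(-\Phi(u, a; y)\big)\,\mu_0(\mathrm{d}u)\,\nu_0(\mathrm{d}a) \]

after absorbing the $\exp\big(\frac{1}{2}|y|_\Gamma^2\big)$ term into the normalisation constant.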
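The potential (4.1) is straightforward to evaluate once a forward map is available. The following is a minimal sketch, assuming the weighted norm is $|r|_\Gamma^2 = r^\top \Gamma^{-1} r$ (the convention implied by the Gaussian noise model) and that the forward map is supplied as a Python callable `G(u, a)` returning a vector of length $J$. The function name and the callable interface are hypothetical.

```python
import numpy as np

def potential(G, u, a, y, Gamma):
    """Phi(u, a; y) = 0.5 * |G(u, a) - y|_Gamma^2, with |r|_Gamma^2 = r^T Gamma^{-1} r."""
    r = np.asarray(G(u, a), dtype=float) - np.asarray(y, dtype=float)
    # Solve Gamma z = r rather than forming the inverse of Gamma explicitly
    return 0.5 * float(r @ np.linalg.solve(Gamma, r))
```

With $\Gamma = \gamma^2 I$ this reduces to $|\mathcal{G}(u, a) - y|^2 / (2\gamma^2)$, so a residual of size $\gamma$ in a single component contributes $1/2$ to the potential.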

We now make this statement rigorous. To keep the situation general, we do not insist that $\Phi$ takes the form (4.1), and instead assert only that $\Phi$ satisfies the following assumptions.

Assumptions 4.1. There exists $X' \times A' \subseteq X \times A$ such that

(i) for every $\varepsilon > 0$ there is an $M_1(\varepsilon) \in \mathbb{R}$ such that for all $u \in X'$ and all $a \in A'$,

\[ \Phi(u, a; y) \geq M_1(\varepsilon) - \varepsilon\|u\|_X^2; \]

(ii) for each $u \in X'$ and $y \in Y$, the potential $\Phi(u, \cdot\,; y) : A' \to \mathbb{R}$ is continuous;

(iii) there exists a strictly positive $M_2 : \mathbb{R}^+ \times \mathbb{R}^+ \times \mathbb{R}^+ \to \mathbb{R}^+$, monotonic non-decreasing separately in each argument, such that for each $r > 0$, $u \in X'$ and $a \in A'$, and $y_1, y_2 \in Y$ with $|y_1|, |y_2| < r$,

\[ |\Phi(u, a; y_1) - \Phi(u, a; y_2)| \leq M_2(r, \|u\|_X, |a|)\,|y_1 - y_2|; \]

(iv) there exists a strictly positive $M_3 : \mathbb{R}^+ \times A \times Y \to \mathbb{R}^+$, continuous in its second component, such that for each $r > 0$, $a \in A'$ and $y \in Y$, and $u_1, u_2 \in X'$ with $\|u_1\|_X, \|u_2\|_X < r$,

\[ |\Phi(u_1, a; y) - \Phi(u_2, a; y)| \leq M_3(r, a, y)\,\|u_1 - u_2\|_X. \]

These assumptions are used in the proof of existence and well-posedness of the posterior distribution, which is given in the appendix:

Theorem 4.2 (Existence and well-posedness). Let assumptions 4.1 hold. Assume that $(\mu_0 \times \nu_0)(X' \times A') = 1$, and that $(\mu_0 \times \nu_0)((X' \times A') \cap B) > 0$ for some bounded set $B \subseteq X \times A$. Then

(i) $\Phi$ is $\mu_0 \times \nu_0 \times \mathbb{Q}_0$-measurable;

(ii) for each $y \in Y$, $Z(y)$ given by

\[ Z(y) = \int_{X \times A} \exp\big(-\Phi(u, a; y)\big)\,\mu_0(\mathrm{d}u)\,\nu_0(\mathrm{d}a) \]

is positive and finite, and so the probability measure $\mu^y$,

\[ \mu^y(\mathrm{d}u, \mathrm{d}a) = \frac{1}{Z(y)} \exp\big(-\Phi(u, a; y)\big)\,\mu_0(\mathrm{d}u)\,\nu_0(\mathrm{d}a), \qquad (4.2) \]

is well-defined.

(iii) Assume additionally that, for every fixed $r > 0$, there exists $\varepsilon > 0$ with

\[ \exp\big(\varepsilon\|u\|_X^2\big)\big(1 + M_2(r, \|u\|_X, |a|)^2\big) \in L^1_{\mu_0 \times \nu_0}(X \times A; \mathbb{R}). \]

Then there is $C(r) > 0$ such that for all $y, y' \in Y$ with $|y|, |y'| < r$,

\[ d_{\mathrm{Hell}}(\mu^y, \mu^{y'}) \leq C(r)\,|y - y'|. \]

Remark 4.3. In this paper we are focused on the case when the field prior $\mu_0$ is taken to be Gaussian. However, the above existence and well-posedness result still holds if, for example, $\mu_0$ is taken to be Besov rather than Gaussian, since a Fernique-type theorem holds for such priors [10, 23].

We show that for both choices of test models, the potential (4.1) satisfies assumptions 4.1:

Proposition 4.4. Let $X = C^0(D; \mathbb{R}^N)$, and let $\mathcal{G} : X \times A \to Y$ denote the forward map corresponding to either the groundwater flow or EIT problem, as detailed in section 2. Let $y \in Y$ and let $\Gamma \in \mathbb{R}^{J \times J}$ be positive definite. Define the potential $\Phi : X \times A \times Y \to \mathbb{R}$ by

\[ \Phi(u, a; y) = \frac{1}{2}|\mathcal{G}(u, a) - y|_\Gamma^2. \]

Then $\Phi$ satisfies assumptions 4.1, with $X' \times A' = X \times A$.

Proof.

(i) $\Phi \geq 0$, so this is true with $M_1 = 0$.

(ii) Fix $u \in X'$ and $y \in Y$. Propositions 2.6 and 2.7 tell us that $\mathcal{G}(u, \cdot)$ is continuous for either choice of test model. The map $z \mapsto |z - y|_\Gamma^2$ is continuous, and so $\Phi(u, \cdot\,; y)$ is continuous too.

(iii) A consequence of propositions 2.6 and 2.7 is that for each $u \in X$ and $a \in A$, $\mathcal{G}(u, a)$ can be bounded in terms of $\|u\|_X$ and $|a|$. The result then follows from the local Lipschitz property of the map $y \mapsto |y|^2$.

(iv) Propositions 2.6 and 2.7 tell us that $\mathcal{G}(\cdot, a)$ is locally Lipschitz for either choice of test model. The map $z \mapsto |z - y|_\Gamma^2$ is locally Lipschitz, and hence we conclude that $\Phi(\cdot, a; y)$ is locally Lipschitz, with Lipschitz constant independent of $a$. □

With a choice of prior as described in section 3, we can therefore apply theorem 4.2 in the cases where the forward map is one of the two described in section 2 and the observational noise is Gaussian. In this case, the constant $M_2(r, \|u\|_X, |a|)$ appearing in assumptions 4.1(iii) is independent of $\|u\|_X$ and $|a|$, and so the integrability condition (iii) in theorem 4.2 always holds via Fernique's theorem. The condition on positivity of a bounded set can be seen by taking, for example, $B = B_1(0) \times S$, where $S$ is the (compact) support of $\nu_0$.

5.  MAP  estimators 

In section 5.1 we characterise the MAP estimators for the posterior $\mu$ in terms of the Onsager-Machlup functional for $\mu$. In section 5.2 we relate this Onsager-Machlup functional to the Fomin derivative of $\mu$, with reference to the work [14].

5.1.  MAP  estimators  and  the  Onsager-Machlup  functional 

Throughout this section we assume that $\mu$ is given by (4.2). Furthermore we assume that $\mu_0$ has mean zero for simplicity. Additionally, when we assume that assumptions 4.1 hold, we will assume that $X' \times A' = X \times A$.

Suppressing the dependence of $\Phi$ on the data $y$ since it is not relevant in the sequel, we define the functional $I : X \times A \to \mathbb{R}$ by

\[ I(u, a) = \Phi(u, a) + J(u) + K(a), \qquad (5.1) \]

where $J$, $K$ are as defined in sections 3.2 and 3.3 respectively. In this section we obtain the following three results concerning $I$ and $\mu$, which are proved in the appendix.

Theorem 5.1. Let assumptions 4.1 hold. Then the function $I$ defined by (5.1) is the Onsager-Machlup functional for $\mu$, where the Onsager-Machlup functional is as defined in definition 3.6.

Theorem 5.2. Let assumptions 4.1 hold. Then there exists $(\bar{u}, \bar{a}) \in E \times S$ such that

\[ I(\bar{u}, \bar{a}) = \inf\{ I(u, a) \mid u \in E,\ a \in S \}. \]

Furthermore, if $(u_n, a_n)_{n \geq 1}$ is a minimising sequence satisfying $I(u_n, a_n) \to I(\bar{u}, \bar{a})$, then there is a subsequence $(u_{n_k}, a_{n_k})_{k \geq 1}$ converging to $(\bar{u}, \bar{a})$ (strongly) in $E \times S$.

Theorem 5.3. Let assumptions 4.1 hold. Assume also that there exists an $M \in \mathbb{R}$ such that $\Phi(u, a) \geq M$ for any $(u, a) \in X \times A$.

(i) Let $(u^\delta, a^\delta) = \operatorname{argmax}_{(u, a) \in X \times A} \mu(B_\delta(u, a))$. There is a $(\bar{u}, \bar{a}) \in E \times S$ and a subsequence of $(u^\delta, a^\delta)_{\delta > 0}$ which converges to $(\bar{u}, \bar{a})$ strongly in $X \times A$.

(ii) The limit $(\bar{u}, \bar{a})$ is a MAP estimator and minimiser of $I$.

A consequence of theorem 5.3 is that, under its assumptions, MAP estimators and minimisers of the Onsager-Machlup functional are equivalent. The proof of this corollary is identical to that of corollary 3.10 in [9]:

Corollary 5.4. Under the conditions of theorem 5.3 we have the following.

(i) Any MAP estimator minimises the Onsager-Machlup functional $I$.

(ii) Any $(u^*, a^*) \in E \times S$ which minimises the Onsager-Machlup functional $I$ is a MAP estimator for the measure $\mu$ given by (4.2).

5.2. The Fomin derivative approach

In recent work of Helin and Burger [14], the concept of MAP estimators was generalised to wMAP estimators using the notion of Fomin differentiability of measures. The definition of wMAP estimators is such that if $u$ is a MAP estimator then it is a wMAP estimator, but not necessarily vice versa. Under certain assumptions, they show that wMAP estimators are equivalent to minimisers of a particular functional. The assumptions do not hold in our case, since our forward map is nonlinear and our prior $\mu_0 \times \nu_0$ is not necessarily convex; however, the functional agrees with our objective functional $I$. Thus in what follows we provide a link between the Fomin derivative of the posterior $\mu$ and our objective functional $I$.

The Fomin derivative of a measure on a Banach space $X$ equipped with its Borel $\sigma$-algebra $\mathcal{B}(X)$ is defined as follows.


Definition 5.5. A measure $\lambda$ on $X$ is called Fomin differentiable along the vector $z \in X$ if, for every set $A \in \mathcal{B}(X)$, there exists a finite limit

\[ d_z\lambda(A) = \lim_{t \to 0} \frac{\lambda(A + tz) - \lambda(A)}{t}. \]

The Radon-Nikodym density of $d_z\lambda$ with respect to $\lambda$ is denoted $\beta_z^\lambda$, and is called the logarithmic derivative of $\lambda$ along $z$.


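Example 5.6.

(i) Let $\nu_0$ be a measure on $\mathbb{R}^k$ with Lebesgue density $\rho$, supported and continuously differentiable on $S \subseteq \mathbb{R}^k$. Then for any $a \in \mathrm{int}(S)$ and $b \in \mathbb{R}^k$ we have

\[ \beta_b^{\nu_0}(a) = \frac{\nabla\rho(a)}{\rho(a)} \cdot b = \partial_b \log\rho(a). \]

(ii) Let $\mu_0$ be a Gaussian measure on a Banach space $X$ with Cameron-Martin space $(E, \langle\cdot, \cdot\rangle_E)$. Then for any $u \in X$ and $h \in E$ we have

\[ \beta_h^{\mu_0}(u) = -\langle u, h\rangle_E. \]

This follows from the Cameron-Martin and dominated convergence theorems.

(iii) Again using the Cameron-Martin and dominated convergence theorems, we see that with $\nu_0$ and $\mu_0$ as above, for any $(u, a) \in X \times \mathrm{int}(S)$ and $(h, b) \in E \times \mathbb{R}^k$,

\[ \beta_{(h, b)}^{\mu_0 \times \nu_0}(u, a) = \beta_h^{\mu_0}(u) + \beta_b^{\nu_0}(a). \]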
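As a sanity check on item (ii) of example 5.6, the following one-dimensional computation recovers the formula directly from definition 5.5. It is a sketch added for illustration; the variance $\sigma^2$ and the density symbol $p$ are assumed notation that does not appear in the paper.

```latex
% Assumption: \mu_0 = N(0, \sigma^2) on X = \mathbb{R}, with Lebesgue density
%   p(u) = (2\pi\sigma^2)^{-1/2} \exp(-u^2 / (2\sigma^2)),
% so that E = \mathbb{R} with \langle u, h \rangle_E = u h / \sigma^2.
\begin{align*}
\mu_0(A + th) &= \int_{A + th} p(v)\,\mathrm{d}v = \int_A p(u + th)\,\mathrm{d}u, \\
d_h\mu_0(A) &= \lim_{t \to 0} \int_A \frac{p(u + th) - p(u)}{t}\,\mathrm{d}u
             = \int_A h\,p'(u)\,\mathrm{d}u
             = \int_A \Big(-\frac{u h}{\sigma^2}\Big) p(u)\,\mathrm{d}u, \\
\beta_h^{\mu_0}(u) &= -\frac{u h}{\sigma^2} = -\langle u, h \rangle_E .
\end{align*}
```

The exchange of limit and integral in the second line is justified by dominated convergence, which is the same ingredient used in the general case.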


We can use the above example to characterise the Fomin derivative of our posterior distribution $\mu$, given by (4.2).

Theorem 5.7. Assume that $\Phi : X \times A \to \mathbb{R}$ is bounded measurable with uniformly bounded derivative, and assume that $\rho$ is continuously differentiable on $S$. Then for each $(u, a) \in X \times \mathrm{int}(S)$ and $(h, b) \in E \times \mathbb{R}^k$, we have

\[ \beta_{(h, b)}^{\mu}(u, a) = -\partial_{(h, b)}\Phi(u, a) - \langle u, h\rangle_E + \partial_b \log\rho(a) = -\partial_{(h, b)} I(u, a). \]

Therefore, $(u, a)$ is a critical point of $I$ if and only if $\beta_{(h, b)}^{\mu}(u, a) = 0$ for all $(h, b) \in E \times \mathbb{R}^k$.
Proof. We use result (2.1.13) from [3], which tells us that if $\lambda$ is a measure differentiable along $z$ and $f$ is a bounded measurable function with uniformly bounded partial derivative $\partial_z f$, then the measure $f \cdot \lambda$ is differentiable along $z$ as well and

\[ d_z(f \cdot \lambda) = \partial_z f \cdot \lambda + f \cdot d_z\lambda. \]

We apply this result with $\lambda = \mu_0 \times \nu_0$, $f = \exp(-\Phi)/Z$ and $z = (h, b)$. Note that $f$ satisfies the assumptions of (2.1.13) due to the assumptions on $\Phi$. The result then follows using example 5.6(iii) above. □

6.  Numerical  experiments 

In  this  section  we  perform  some  numerical  experiments  related  to  the  theory  above  for  a 
variety  of  geometric  models,  in  the  case  of  the  groundwater  flow  forward  map  introduced  in 
section  2.2.  We  both  compute  minimisers  of  the  relevant  Onsager-Machlup  functional  (i.e. 
MAP  estimators),  and  we  sample  the  posterior  distribution  using  a  state-of-the-art  function 
space  Metropolis-Hastings  MCMC  method.  We  then  relate  the  samples  to  the  MAP  esti¬ 
mators.  From  these  numerical  experiments  we  observe  the  following  behaviour  of  the  pos¬ 
terior  distribution. 


(i) The posterior distribution can be highly multi-modal, especially when the parameterised geometry is non-trivial. This is evident from the sensitivity of the minimisation of the objective functional to its initial state, and from the behaviour of MCMC chains initialised at these calculated minimisers.

(ii)  When  the  geometry  is  incorrect  the  fields  attempt  to  compensate,  which  presumably 
contributes  to  the  existence  of  multiple  local  minimisers  of  the  objective  functional;  this 
occurs  in  both  the  MAP  estimation  and  the  MCMC  samples.  A  consequence  is  that  many 
of  the  local  minimisers  lack  the  desired  sharp  interfaces.  These  minimisers  could 
however  be  used  to  suggest  more  appropriate  geometric  parameters  for  the  initialisation. 

(iii)  The  mixing  rates  of  MCMC  chains  have  a  strong  dependence  upon  which  local 
minimiser  they  are  initialised  at:  acceptance  rates  can  vary  wildly  when  the  initial  state  is 
changed  but  all  other  parameters  are  kept  fixed.  This  provides  some  insight  into  the  shape 
of  the  posterior  distribution. 

(iv)  Though  often  there  are  many  local  minimisers,  they  can  be  separated  into  classes  of 
minimisers  sharing  similar  characteristics,  such  as  close  geometry.  MCMC  chains 
typically  tend  to  stay  within  these  classes,  which  can  be  observed  by  monitoring  the 
closest  local  minimiser  to  an  MCMC  chain’s  state  at  each  step.  This  suggests  that  the 
posterior  can  possess  several  clusters  of  nearby  modes. 

One  conclusion  we  can  draw  from  the  above  points  is  that  there  are  often  many  different 
geometries  that  are  consistent  with  the  data.  This  is  not  necessarily  an  effect  of  noise  on  the 
measurements,  and  the  effect  may  persist  as  the  noise  level  goes  to  zero,  since  it  is  unknown  if 
these  geometric  parameters  are  uniquely  identifiable  in  general. 

6.1.  Test  models 

We  consider  three  different  geometric  models:  a  two  parameter,  two  layer  model;  a  five 
parameter,  three  layer  model  with  fault;  and  a  five  parameter  channelised  model. 

In what follows, as in example 3.4, we define the negative Laplacian with Neumann boundary conditions:

\[ A = -\Delta, \qquad \mathcal{D}(A) = \Big\{ u \in H^2(D) \;\Big|\; \frac{\partial u}{\partial \nu} = 0 \text{ on } \partial D, \ \int_D u(x)\,\mathrm{d}x = 0 \Big\}. \]

Recall that if $u \sim N(0, A^{-\alpha})$ with $\alpha > d/2$, then $u$ is almost surely continuous [10].

Inverse Problems 32 (2016) 105003

M M Dunlop and A M Stuart

Figure 7. The definition of the geometric parameters $a = (a_1, a_2)$ in model 1.

Figure 8. The definition of the geometric parameters $a = (a_1, a_2, a_3, a_4, a_5)$ in model 2.

6.1.1. Model 1 (two layer). This model is described in example 2.1. The geometric parameters $a = (a_1, a_2)$ are defined as in figure 7. For simulations, we use the choice of prior

\[ \mu_0 = N(1, A^{-1.4}) \times N(-1, A^{-1.8}), \qquad \nu_0 = U([0, 1]) \times U([0, 1]). \]
Figure 9. The definition of the geometric parameters $a = (a_1, a_2, a_3, a_4, a_5)$ in model 3.

Table 1. The relative error on the data, when each measurement is perturbed by an instance of $N(0, 0.01^2)$ noise.

Model number   Mean relative error (%)   Range of relative errors (%)
1              0.5                       0.02-3.5
2              0.9                       0.1-4.0
3              0.3                       0.1-1.0

6.1.2. Model 2 (three layer with fault). This model is described in [16], where it is labelled test model 1. The geometric parameters $a = (a_1, a_2, a_3, a_4, a_5)$ are defined as in figure 8, with the fault occurring at $x = 0.55$. For simulations, we use the choice of prior

\[ \mu_0 = N(2, 2A^{-1.4}) \times N(0, A^{-1.8}) \times N(-2, 2A^{-1.4}), \qquad \nu_0 = U(S) \times U(S) \times U([-0.3, 0.3]), \]

where $S \subseteq [0, 1]^2$ is the simplex $S = \{(x, y) \mid 0 \leq x \leq 1,\ x \leq y \leq 1\}$.

6.1.3. Model 3 (channel). This model is described in [16], where it is labelled test model 2. The geometric parameters $a = (a_1, a_2, a_3, a_4, a_5)$ are defined as in figure 9. Here $a_1, a_2, a_3, a_4, a_5$ represent the channel amplitude, frequency, angle, initial point and width respectively. For simulations, we use the choice of prior

\[ \mu_0 = N(1, A^{-1.4}) \times N(-1, A^{-1.8}), \]

\[ \nu_0 = U([0, 1]) \times U([\pi, 4\pi]) \times U([-\pi/4, \pi/4]) \times U([0, 1]) \times U([0, 0.4]). \]

For each model, we fix a true permeability $(u^\dagger, a^\dagger)$ as a draw from the corresponding prior distribution, generated on a mesh of $256^2$ points. For the forward model, we take the coefficient map $\sigma(\cdot) = \exp(\cdot)$. We observe the pressure on a grid $(x_i)_{i=1}^{25}$ of 25 uniformly spaced points, via the maps (2.4) with $\varepsilon = 0.05$. We add i.i.d. Gaussian noise $N(0, \gamma^2)$ to each observation, taking $\gamma = 0.01$. The resulting relative errors on the data can be seen in table 1. Small relative errors of this size typically make the posterior distribution hard to sample as they lead to measure concentration phenomena; MAP estimation can thus be particularly important.


6.2.  MAP  estimation 

Based  on  the  theory  in  section  5,  we  can  calculate  MAP  estimators  by  minimising  the 
Onsager-Machlup  functional  for  the  posterior  distribution.  We  compute  local  minimisers  of 
the  Onsager-Machlup  functional  using  the  following  iterative  alternating  method. 

Algorithm  6.1. 

(1) Choose an initial state $(u_0, a_0) \in X \times A$.

(2)  Update  the  geometric  parameters  simultaneously  using  the  Nelder-Mead  algorithm. 

(3)  Update  each  field  individually  using  a  line-search  in  the  direction  provided  by  the  Gauss- 
Newton  algorithm. 

(4)  Go  to  2. 

The Nelder-Mead and Gauss-Newton algorithms are discussed in [30], in sections 9.5 and 10.3 respectively. Since we do not update the fields and geometric parameters simultaneously, it is possible that this algorithm will get caught in a saddle point: consider for example the function $f : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, $f(x, y) = xy$, at the point $(0, 0)$, being minimised alternately in the coordinate directions. Hence once the algorithm stalls, we propose a large number of random simultaneous updates in an attempt to find a lower functional value. If this is successful, we return to step (2) of the algorithm. We terminate the algorithm once the difference between successive values of $\Phi$ is below $\mathrm{TOL} = 10^{-5}$. Calculations are performed on a mesh of $64^2$ points in order to avoid an inverse crime.
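The saddle point example can be checked numerically. The sketch below is a toy illustration of the stalling behaviour and of the random simultaneous updates used to escape it; it is not the authors' implementation, and the helper names, search grid, proposal radius and number of tries are all hypothetical.

```python
import random

def f(x, y):
    """Toy objective with a saddle point at the origin."""
    return x * y

def coordinate_step_improves(x, y, grid):
    """True if changing x alone or y alone (to a value in `grid`) lowers f."""
    best = f(x, y)
    return any(f(t, y) < best for t in grid) or any(f(x, t) < best for t in grid)

def random_joint_escape(x, y, radius=0.1, tries=1000, seed=0):
    """Propose random simultaneous updates; return the first that lowers f, else None."""
    rng = random.Random(seed)
    for _ in range(tries):
        px = x + rng.uniform(-radius, radius)
        py = y + rng.uniform(-radius, radius)
        if f(px, py) < f(x, y):
            return px, py
    return None
```

At $(0, 0)$ no update of a single coordinate decreases $f$, because $f(x, 0) = f(0, y) = 0$, whereas any simultaneous update with $x$ and $y$ of opposite sign does.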

To ensure that we explore the support of the posterior distribution, we choose a variety of initial states $(u_0, a_0) \in X \times A$ for the minimisation such that $I(u_0, a_0) < \infty$ in the continuum setting. To this end, we let $a_0$ be a draw from the prior distribution $\nu_0$, and take $u_0$ to lie in the Cameron-Martin space of $\mu_0$. Specifically, if a component of $u \in X$ has prior distribution $N(m, A^{-\alpha})$, we take the corresponding component of $u_0$ to be a draw from $N(m, A^{-\alpha - d/2})$. Output of the algorithm is shown in figures 10-12.

We  first  comment  on  the  minimisers  of  the  Onsager-Machlup  functional  for  model  1. 
Generally  the  geometric  parameters  are  closely  recovered  regardless  of  the  initialisation  state, 
though  there  is  more  variation  in  the  fields.  In  the  simulations  where  the  geometry  is  inac¬ 
curate,  for  example  simulations  7,  17  and  46,  the  fields  can  be  seen  to  be  compensating  by 
forming  a  ‘soft’  interface  where  the  true  interface  is. 

The minimisers associated with model 2 admit much more variation, though it is possible to partition them into smaller subsets of minimisers which share similar characteristics to one another, as mentioned in point (iv) at the beginning of the section. The clustering of the different minimisers is performed by eye, classifying them according to similar geometric parameters. Additionally we have an 'other' class, containing the minimisers which neither appear similar to one another nor appear to fit into any other class. We see later with MCMC simulations that these states do still act as local maximisers of the posterior probability.

The minimisers of the Onsager-Machlup functional for model 3 show even more variation than those for model 2, with the geometry in half of the minimisers not even being close to the true geometry. In the cases where the geometry is drastically wrong the fields have again attempted to compensate. This behaviour is particularly evident in the 'other' class, and is echoed in the MCMC simulations later. The 'other' class here is much larger than for model 2, though as with model 2 these states do appear to act as distinct local maximisers of the posterior probability.

Figure 10. (Model 1) The true log-permeability field (top), and 50 local minimisers arising from minimisation initialised at draws from a smoothed prior distribution. Simulation 12 has the lowest functional value, with $I(u_{\mathrm{MAP}}, a_{\mathrm{MAP}}) = 2847$.

Figure 11. (Model 2) The true log-permeability field (top), and 50 local minimisers arising from minimisation initialised at draws from a smoothed prior distribution. Simulation 7 has the lowest functional value, with $I(u_{\mathrm{MAP}}, a_{\mathrm{MAP}}) = 2567$. The minimisers have been divided into classes based on similar characteristics.

This multi-modality of the posterior distribution is not unexpected. The paper [5] considers the history matching problem in reservoir simulation, in which inference is done jointly on both geometric and permeability parameters in the IC fault model. Though the forward map and observation maps are different in our model, we observe the same clustering of nearby local MAP estimators, and increased multi-modality as the dimension of the parameter space increases. In [5] it is observed that the global minimum often does not correspond to the truth, especially in the presence of measurement noise, and so all local minimisers of the Onsager-Machlup functional should be sought before drawing conclusions about the permeabilities; this appears to be the case in our model as well. We note that MCMC can be useful in identifying a range of such minimisers, in view of the links established in the next subsection between MCMC and MAP estimation.

Figure 12. (Model 3) The true log-permeability field (top), and 50 local minimisers arising from minimisation initialised at draws from a smoothed prior distribution. Simulation 20 has the lowest functional value, with $I(u_{\mathrm{MAP}}, a_{\mathrm{MAP}}) = 2117$. The minimisers have been divided into classes based on similar characteristics.

6.3.  MCMC  and  local  minimisers 

We now observe the behaviour of MCMC chains initialised at these local minimisers of the Onsager-Machlup functional. We use a Metropolis-within-Gibbs algorithm for the sampling, alternating between preconditioned Crank-Nicolson updates for the fields, see [6] for details, and random walk Metropolis updates for the geometric parameters. Again, simulations are performed on a mesh of $64^2$ points in order to avoid an inverse crime. $10^5$ samples are taken for each chain, with the initial $2 \times 10^4$ discarded as burn-in. The conditional means calculated from the samples are shown in figures 13-15.
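The alternating structure of such a sampler can be sketched on a toy problem. The code below is a minimal illustration under simplifying assumptions that are ours rather than the paper's: the field is replaced by a finite vector with prior $N(0, I)$, there is a single scalar geometric parameter with a uniform prior on an interval, and the step sizes `beta` and `step` are arbitrary. The function name and signature are hypothetical.

```python
import numpy as np

def metropolis_within_gibbs(phi, u0, a0, support, n_steps, beta=0.2, step=0.05, seed=0):
    """Alternate a pCN update of u (prior N(0, I)) with a random walk update of a.

    phi(u, a) is the potential; `support` = (lo, hi) is the support of a
    uniform prior on the scalar parameter a.
    """
    rng = np.random.default_rng(seed)
    u, a = np.array(u0, dtype=float), float(a0)
    lo, hi = support
    chain_u, chain_a = [], []
    for _ in range(n_steps):
        # pCN proposal is reversible with respect to the Gaussian prior,
        # so only the potential enters the acceptance probability
        v = np.sqrt(1.0 - beta**2) * u + beta * rng.standard_normal(u.shape)
        if np.log(rng.uniform()) < phi(u, a) - phi(v, a):
            u = v
        # Random walk Metropolis for a; the uniform prior density cancels
        # inside the support, and proposals outside it are rejected
        b = a + step * rng.standard_normal()
        if lo <= b <= hi and np.log(rng.uniform()) < phi(u, a) - phi(u, b):
            a = b
        chain_u.append(u.copy())
        chain_a.append(a)
    return np.array(chain_u), np.array(chain_a)
```

Because the pCN proposal preserves the Gaussian prior, its acceptance rate does not degenerate as the discretisation of the field is refined, which is the usual motivation for using it in function space settings.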

We monitor the value that $\Phi$ takes along the chain $(u^{(n)}, a^{(n)})$, and compare it with the value $\Phi$ takes on the local minimisers $(u^i_{\mathrm{MAP}}, a^i_{\mathrm{MAP}})$. This is shown in figures 16-18, with the horizontal lines being the different values of $\Phi(u^i_{\mathrm{MAP}}, a^i_{\mathrm{MAP}})$. Note that it makes no sense to monitor the value that the objective functional $I$ takes along the chain as the fields almost surely do not lie in the corresponding Cameron-Martin spaces, and so $I$ is almost surely infinite along the chain in the continuum setting.

In  addition,  we  monitor  which  minimiser  the  chain  is  nearest  at  each  step,  in  the  per¬ 
meability  space.  Specifically,  we  look  at 

m„  ■■=  argmin  || F(u(n\  a(n))  -  F(u mAP,  aMAp)IU2(r>)>  (6.1) 

i 

where $F : X \times A \to L^\infty(D)$ is the construction map (2.1) from the state space to the
permeability space. We choose the $L^2$ norm over the $L^\infty$ norm for the permeability space
to avoid over-penalising incorrect geometry. A selection of traces of $m_n$ is shown in
figures 19-21. These illustrate that even though some of the local minimisers are very far
from the true log-permeability, they do indeed act as local maximisers of the posterior
probability. Moreover, they show the interaction between the different classes of minimisers
in the cases of models 2 and 3. Specifically, they show that the MCMC chains can easily move
within these classes, but moving between classes is more difficult.
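
As an illustration of (6.1), the trace $m_n$ can be computed from stored permeability fields as follows. This is a sketch under stated assumptions: the fields are taken to be piecewise constant on a uniform mesh with cell area `cell_area`, and the function name and array layout are hypothetical.

```python
import numpy as np

def nearest_minimiser_trace(chain_fields, map_fields, cell_area):
    """Index of the nearest local minimiser, in the discrete L^2(D) norm, per step.

    chain_fields: (n_steps, n_cells) array of fields F(u^(n), a^(n))
    map_fields:   (n_map, n_cells) array of fields F(u^i_MAP, a^i_MAP)
    For long chains on fine meshes this should be evaluated in chunks.
    """
    diff = chain_fields[:, None, :] - map_fields[None, :, :]
    dist = np.sqrt(cell_area * np.sum(diff**2, axis=2))   # discrete L^2 norm
    return np.argmin(dist, axis=1), dist

# Small made-up example with three "minimisers" on a four-cell mesh.
maps = np.array([[0.0, 0.0, 0.0, 0.0],
                 [1.0, 1.0, 0.0, 0.0],
                 [1.0, 1.0, 1.0, 1.0]])
chain = np.array([[0.9, 1.1, 0.1, -0.1],
                  [0.1, 0.0, 0.0, 0.2],
                  [1.0, 1.0, 0.8, 1.2]])
m, dist = nearest_minimiser_trace(chain, maps, cell_area=0.25)
```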

We now discuss the above monitored quantities, and their relation to MAP estimators, on
a model-by-model basis. Despite the slight variation in the fields of the minimisers from
model 1, the conditional means arising from the MCMC are nearly all identical. Simulation
23 stands out from the rest due to its slightly incorrect geometry; this effect can be seen in
the trace plot of $\Phi$, figure 16, where the value of $\Phi$ remains larger than for the
simulations started elsewhere. The traces of $\Phi$ for all other initialisations behave
similarly to one another, taking similar misfit values after $2 \times 10^4$ samples. From
figure 19, it can be seen that the MCMC chains considered all spend a lot of time close to MAP
estimator 38, despite this not being the estimator with the lowest functional value.

For model 2, the conditional means within the different classes are typically very similar
to one another. Classes A and C resemble each other, and class B has compensated for
incorrect geometry with the centre field. Faults have developed in class D, though there is still
some compensation in the field. The centre field and a small fault have appeared in class E, but
again the fields are compensating. The geometric parameters for the permeabilities in the
other class remain relatively unchanged, but the fields have more freedom to attain a lower
misfit than in the Onsager-Machlup functional minimisation due to the lack of a regularisation
term. Figure 17 shows evidence for a number of local minima with a large data misfit value $\Phi$,
with some chains appearing to remain stuck in their vicinity. The four chains visible in
figure 17 (top) correspond to chains 49, 47, 45 and 43, from highest to lowest value, all
lying in the other class; despite their significantly incorrect geometry, the corresponding
MAP estimators appear to be genuine local maximisers of the posterior probability.

Figure 14. (Model 2) The true log-permeability field (top), and the conditional mean
arising from MCMC chains initialised at the corresponding local minimisers above. We
group them into the same classes as the local minimisers.

In the channelised model, model 3, there is yet more variation between local minimisers.
Here the compensation effect by the fields is even more apparent in the conditional means,
especially in the other class. From figure 18 it appears that the local minima are much sharper
and more sparsely distributed than in the previous two models. Again the chains with the largest
values were initialised at minimisers in the other class, suggesting the existence of many
posterior modes with incorrect geometry.

Figure 15. (Model 3) The true log-permeability field (top), and the conditional mean
arising from MCMC chains initialised at the corresponding local minimisers above. We
group them into the same classes as the local minimisers.

The mixing of the MCMC chains varies heavily with the initialisation points of the
chains: with the same jump parameters for the field and geometric parameter proposals,
acceptance rates vary widely depending on which minimiser the chain was started from. This
indicates that some of the minima are much sharper than others. This is also evident from the
traces of $m_n$ defined above, figures 19-21, especially in model 3. Note also from these figures
that the nearest local minimum typically lies in the same class as the initialisation state,
though jumps between classes are possible. Though not shown, in model 2, whenever the
initial state lies in class A, the nearest minimiser always lies in class A.

Figure 16. (Model 1) The evolution of $\Phi$ as the MCMC chains progress. The horizontal
lines represent the value of each local minimiser under $\Phi$. Nearly all of the simulations
find a small value of $\Phi$ almost immediately, but simulation 23 remains caught in the
local minimiser for some time before it follows.

7.  Conclusions  and  future  work 

We  have  made  a  new  contribution  to  the  recently  developed  theory  of  MAP  estimation  in 
infinite  dimensions  [9,  14].  We  link  MAP  estimation  to  a  variational  Onsager-Machlup 
functional.  The  work  is  focused  on  priors  for  piecewise  Gaussian  random  fields,  with  random 
interfaces  parameterised  finite-dimensionally.  Such  fields  arise  naturally  in  applications  such 
as  groundwater  flow  and  EIT,  and  these  are  used  to  illustrate  the  theory  and  numerics.  The 
Figure 17. (Model 2) The evolution of $\Phi$ as the MCMC chains progress. The horizontal
lines represent the value of each local minimiser under $\Phi$. The majority of the
simulations find a small value of $\Phi$ almost immediately, but a number of them fail to do so,
settling in local minima. The shape of these minima can be seen in figure 14, and they
generally correspond to those in the same class as the initial state.

Figure 18. (Model 3) The evolution of $\Phi$ as the MCMC chains progress. The horizontal
lines represent the value of each local minimiser under $\Phi$. The majority of the
simulations find a small value of $\Phi$ almost immediately, but a number of them fail to do so,
settling in local minima. The shape of these minima can be seen in figure 15, and they
generally correspond to those in the same class as the initial state.




Inverse  Problems  32  (2016)  105003 


M  M  Dunlop  and  A  M  Stuart 


[Figure: nearest minimiser (vertical axis) against sample number $\times 10^4$ (horizontal axis).]

Figure 19. (Model 1) The trace of $m_n$ as defined by (6.1), when the chain is initialised at
a variety of minimisers, specifically numbers 1, 2, ..., 8.

Figure 20. (Model 2) The trace of $m_n$ as defined by (6.1), when the chain is initialised at
a variety of minimisers, specifically numbers 7, 14, 21, 28, 35, 39, 46 and 50. The
different classes are alternately shaded.

Figure 21. (Model 3) The trace of $m_n$ as defined by (6.1), when the chain is initialised at
a variety of minimisers, specifically numbers 7, 13, 21, 33, 38, 47, 48 and 49. The
different classes are alternately shaded.

work opens up several new avenues for investigation. A major theoretical direction is to fully
reconcile  the  approaches  in  [9,  14];  the  work  in  this  paper  suggests  that  this  may  be  possible. 
On  the  applications  side  an  important  new  direction  would  be  to  consider  problems  in  which 
the  geometric  parameters  are  no  longer  independent  from  the  fields  a  priori.  A  possible 
extension  could  be  to  treat  the  geometric  parameters  as  hyperparameters  for  the  fields  under 
the  prior.  This  would  allow,  for  example,  the  fields  to  have  specific  boundary  conditions  at  the 
interfaces,  which  may  be  more  physically  appropriate  in  some  contexts.  A  related  hierarchical 
model  was  considered  in  [29],  in  which  prior  samples  were  piecewise  white;  this  could  be 
extended  to  allow  for  spatial  correlations  in  the  continuum  setting.  Computationally  an 
exciting  direction  is  to  build  upon  definitions  of  MAP  estimators  to  develop  hybrid  algorithms 
which  fully  exploit  local  minimiser  structure  of  the  Onsager-Machlup  functional 
within  MCMC. 

Appendix

In  this  appendix  we  provide  proofs  of  the  results  given  in  the  paper. 


A.1. Results from section 2

Before  we  prove  lemma  2.5  we  require  the  following  lemma. 


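Lemma A.1. Let $(X, \mathcal{F}, \mu)$ be a measure space and $f \in L^1(X, \mathcal{F}, \mu)$. Let
$(B_n)_{n \ge 1} \subseteq \mathcal{F}$ be a sequence of measurable subsets of $X$ with $\mu(B_n) \to 0$
as $n \to \infty$. Then

$$ \int_{B_n} f(x)\, \mu(\mathrm{d}x) \to 0 \quad \text{as } n \to \infty. $$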


Proof. Write $f_n(x) = f(x)\mathbf{1}_{B_n}(x)$. We have that $f_n \to 0$ in measure: for any $\delta > 0$,

$$ \mu(\{x \in X \mid |f_n(x)| > \delta\}) \le \mu(\{x \in X \mid |f_n(x)| \ne 0\}) \le \mu(B_n) \to 0. $$

Now suppose that $\|f_n\|_{L^1}$ does not tend to zero. Then there exist $\delta > 0$ and a subsequence
$(f_{n_k})_{k \ge 1}$ such that $\|f_{n_k}\|_{L^1} \ge \delta$ for all $k \ge 1$. This subsequence still
converges to zero in measure, and so admits a further subsequence that converges to zero almost
everywhere. We can bound this subsequence uniformly by $|f|$, which is integrable, and so an
application of the dominated convergence theorem leads to a contradiction. The result follows. □
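
A quick numerical illustration of lemma A.1, under assumptions chosen purely for the example: Lebesgue measure on $(0, 1]$, the unbounded but integrable function $f(x) = x^{-1/2}$, and $B_n = (0, 1/n]$. Since $f$ is unbounded on every $B_n$, the crude bound $\mu(B_n) \sup |f|$ is useless here, yet the integrals still tend to zero (the exact value is $2/\sqrt{n}$). The function name and the quadrature resolution are hypothetical.

```python
import numpy as np

def integral_over_Bn(n, n_quad=100_000):
    """Midpoint-rule approximation of the integral of x**-0.5 over B_n = (0, 1/n]."""
    h = (1.0 / n) / n_quad
    x = (np.arange(n_quad) + 0.5) * h      # midpoints of the quadrature cells
    return float(np.sum(x ** -0.5) * h)

ns = [10, 100, 1000, 10000]
values = [integral_over_Bn(n) for n in ns]
exact = [2.0 / np.sqrt(n) for n in ns]
```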

Proof of lemma 2.5. Showing that $\mathcal{R}$ is well-defined is equivalent to showing that PDE
(2.3) has a unique solution for all $(u, a) \in X \times A$. Since $u^a \in L^\infty(D)$ it is bounded,
and so by the continuity and positivity of $\sigma$ there exist $\kappa_{\min}, \kappa_{\max} > 0$ with
$\kappa_{\min} \le \sigma(u^a) \le \kappa_{\max}$. The associated bilinear form is hence bounded and
coercive. The right-hand side can be seen to lie in $H^{-1}(D)$ since $G \in H^1(D)$ and
$\sigma(u^a) \le \kappa_{\max}$, and so a unique solution exists by Lax-Milgram.

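(i) In its weak form, the PDE (2.3) is given by

$$ \int_D \sigma(u^a) \nabla q_{u,a} \cdot \nabla \varphi = f(\varphi) - \int_D \sigma(u^a) \nabla G \cdot \nabla \varphi \quad \text{for all } \varphi \in V. $$

Taking $\varphi = q_{u,a}$ we deduce that

$$ \kappa_{\min}(u, a) \|\nabla q_{u,a}\|_{L^2}^2 \le \int_D \sigma(u^a) \nabla q_{u,a} \cdot \nabla q_{u,a} $$
$$ = f(q_{u,a}) - \int_D \sigma(u^a) \nabla G \cdot \nabla q_{u,a} $$
$$ \le \|f\|_{V^*} \|q_{u,a}\|_V + \|\sigma(u^a)\|_{L^\infty} \|\nabla G\|_{L^2} \|\nabla q_{u,a}\|_{L^2} $$

and so we have the estimate

$$ \|p_{u,a}\|_V \le \|q_{u,a}\|_V + \|G\|_V \le (\|f\|_{V^*} + \|\sigma(u^a)\|_{L^\infty} \|G\|_V) / \kappa_{\min}(u, a) + \|G\|_V. $$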

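(ii) Let $u, v \in X$ and $a \in A$. Then $p_{u,a} - p_{v,a}$ satisfies the PDE

$$ -\nabla \cdot (\sigma(u^a) \nabla (p_{u,a} - p_{v,a})) = \nabla \cdot ((\sigma(u^a) - \sigma(v^a)) \nabla p_{v,a}) \quad \text{in } D, $$
$$ p_{u,a} - p_{v,a} = 0 \quad \text{on } \partial D. $$

Setting $\kappa_*(u, v, a) = \kappa_{\min}(u, a) \wedge \kappa_{\min}(v, a)$, we see

$$ \kappa_*(u, v, a) \|\nabla (p_{u,a} - p_{v,a})\|_{L^2}^2 \le \int_D \sigma(u^a) |\nabla (p_{u,a} - p_{v,a})|^2 $$
$$ = \int_D (\sigma(u^a) - \sigma(v^a)) \nabla (p_{u,a} - p_{v,a}) \cdot \nabla p_{v,a} $$
$$ \le \|\sigma(u^a) - \sigma(v^a)\|_{L^\infty} \|\nabla (p_{u,a} - p_{v,a})\|_{L^2} \|\nabla p_{v,a}\|_{L^2} $$

and so by (i),

$$ \|p_{u,a} - p_{v,a}\|_V \le \|p_{v,a}\|_V \|\sigma(u^a) - \sigma(v^a)\|_{L^\infty} / \kappa_*(u, v, a) $$
$$ \le \|\sigma(u^a) - \sigma(v^a)\|_{L^\infty} \times \big( (\|f\|_{V^*} + \|\sigma(u^a)\|_{L^\infty} \|G\|_V) / \kappa_*(u, v, a)^2 + \|G\|_V / \kappa_*(u, v, a) \big). $$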

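Using that the $A_i$ are disjoint gives that

$$ \|\sigma(u^a) - \sigma(v^a)\|_{L^\infty} = \Big\| \sigma\Big( \sum_{i=1}^N u_i \mathbf{1}_{A_i(a)} \Big) - \sigma\Big( \sum_{i=1}^N v_i \mathbf{1}_{A_i(a)} \Big) \Big\|_{L^\infty} = \|\sigma(u_k) - \sigma(v_k)\|_{L^\infty} $$

for some $k = k(a)$. Now suppose that $\|u\|_X, \|v\|_X < r$. Then the $C^1$ property of $\sigma$ yields

$$ \|\sigma(u_k) - \sigma(v_k)\|_{L^\infty} \le \max_{|t| \le r} |\sigma'(t)| \cdot \|u_k - v_k\|_{L^\infty} \le \max_{|t| \le r} |\sigma'(t)| \cdot \|u - v\|_X. $$

Finally we deal with the $\kappa_*^{-j}$ terms:

$$ \kappa_*(u, v, a)^{-j} = \Big[ \Big( \operatorname*{ess\,inf}_{x \in D} \sigma(u^a(x)) \Big) \wedge \Big( \operatorname*{ess\,inf}_{x \in D} \sigma(v^a(x)) \Big) \Big]^{-j} $$
$$ \le \Big( \min_{|t| \le r} \sigma(t) \wedge \min_{|t| \le r} \sigma(t) \Big)^{-j} = \Big( \min_{|t| \le r} \sigma(t) \Big)^{-j}. $$

We bound the $\|\sigma(u^a)\|_{L^\infty}$ term similarly. Putting the above bounds together, we have

$$ \|\mathcal{R}(u, a) - \mathcal{R}(v, a)\|_V = \|p_{u,a} - p_{v,a}\|_V $$
$$ \le \max_{j = 1, 2} \Big( \min_{|t| \le r} \sigma(t) \Big)^{-j} \Big( \|f\|_{V^*} + \|G\|_V \Big( \max_{|t| \le r} \sigma(t) + 1 \Big) \Big) \times \max_{|t| \le r} |\sigma'(t)| \cdot \|u - v\|_X $$
$$ = L(r) \|u - v\|_X. $$

Note that the constant $L(r)$ is uniform in $a$.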

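(iii) We use a similar approach to the previous part. Given $u \in X$ and $a, b \in A$, the
difference $p_{u,a} - p_{u,b}$ satisfies

$$ -\nabla \cdot (\sigma(u^a) \nabla (p_{u,a} - p_{u,b})) = \nabla \cdot ((\sigma(u^a) - \sigma(u^b)) \nabla p_{u,b}) \quad \text{in } D, $$
$$ p_{u,a} - p_{u,b} = 0 \quad \text{on } \partial D, $$

which leads to the bound

$$ \kappa_*(u, a, b) \|\nabla (p_{u,a} - p_{u,b})\|_{L^2}^2 \le \int_D \sigma(u^a) |\nabla (p_{u,a} - p_{u,b})|^2 $$
$$ = \int_D (\sigma(u^a) - \sigma(u^b)) \nabla (p_{u,a} - p_{u,b}) \cdot \nabla p_{u,b} $$
$$ \le \|\nabla (p_{u,a} - p_{u,b})\|_{L^2} \|(\sigma(u^a) - \sigma(u^b)) \nabla p_{u,b}\|_{L^2}, $$

where $\kappa_*(u, a, b) = \kappa_{\min}(u, a) \wedge \kappa_{\min}(u, b)$. It follows that

$$ \|p_{u,a} - p_{u,b}\|_V \le \|(\sigma(u^a) - \sigma(u^b)) \nabla p_{u,b}\|_{L^2} / \kappa_*(u, a, b). $$

Again by the disjointness of the $A_i$ and the $C^1$ property of $\sigma$,

$$ \|(\sigma(u^a) - \sigma(u^b)) \nabla p_{u,b}\|_{L^2} = \Big\| \Big( \sigma\Big( \sum_{i=1}^N u_i \mathbf{1}_{A_i(a)} \Big) - \sigma\Big( \sum_{i=1}^N u_i \mathbf{1}_{A_i(b)} \Big) \Big) \nabla p_{u,b} \Big\|_{L^2} $$
$$ = \Big\| \sum_{i=1}^N (\sigma(u_i \mathbf{1}_{A_i(a)}) - \sigma(u_i \mathbf{1}_{A_i(b)})) \nabla p_{u,b} \Big\|_{L^2} $$
$$ \le \sum_{i=1}^N \|(\sigma(u_i \mathbf{1}_{A_i(a)}) - \sigma(u_i \mathbf{1}_{A_i(b)})) \nabla p_{u,b}\|_{L^2} $$
$$ \le \sum_{i=1}^N \max_{|t| \le \|u_i\|_\infty} |\sigma'(t)| \cdot \|(u_i \mathbf{1}_{A_i(a)} - u_i \mathbf{1}_{A_i(b)}) \nabla p_{u,b}\|_{L^2} $$
$$ \le \sum_{i=1}^N \max_{|t| \le \|u_i\|_\infty} |\sigma'(t)| \cdot \|u_i\|_\infty \|\mathbf{1}_{A_i(a) \Delta A_i(b)} \nabla p_{u,b}\|_{L^2} $$

since $|\mathbf{1}_A - \mathbf{1}_B| = \mathbf{1}_{A \Delta B}$. Now as before we can bound $\kappa_*^{-1}$:

$$ \kappa_*(u, a, b)^{-1} = \Big[ \Big( \operatorname*{ess\,inf}_{x \in D} \sigma(u^a(x)) \Big) \wedge \Big( \operatorname*{ess\,inf}_{x \in D} \sigma(u^b(x)) \Big) \Big]^{-1} $$
$$ \le \Big( \min_{|t| \le \max_i \|u_i\|_\infty} \sigma(t) \wedge \min_{|t| \le \max_i \|u_i\|_\infty} \sigma(t) \Big)^{-1} \le \Big( \min_{|t| \le \|u\|_X} \sigma(t) \Big)^{-1}. $$

Putting the above bounds together, we have

$$ \|\mathcal{R}(u, a) - \mathcal{R}(u, b)\|_V = \|p_{u,a} - p_{u,b}\|_V $$
$$ \le \Big( \min_{|t| \le \|u\|_X} \sigma(t) \Big)^{-1} \sum_{i=1}^N \max_{|t| \le \|u_i\|_\infty} |\sigma'(t)| \cdot \|u_i\|_\infty \|\mathbf{1}_{A_i(a) \Delta A_i(b)} \nabla p_{u,b}\|_{L^2} $$
$$ = \Big( \min_{|t| \le \|u\|_X} \sigma(t) \Big)^{-1} \sum_{i=1}^N \max_{|t| \le \|u_i\|_\infty} |\sigma'(t)| \cdot \|u_i\|_\infty \Big( \int_{A_i(a) \Delta A_i(b)} |\nabla p_{u,b}|^2 \Big)^{1/2}. $$

The right-hand side goes to zero as each $|A_i(a) \Delta A_i(b)| \to 0$ by lemma A.1, since
$|\nabla p_{u,b}| \in L^2(D)$, and so the continuity of $\mathcal{R}(u, \cdot)$ follows from the assumed
continuity of the maps $A_i$. □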
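
The continuity in the geometric parameter established in part (iii) can be checked by hand in one dimension. The sketch below is an illustration under assumptions that are not taken from the paper: $D = (0, 1)$, no source term, boundary values $p(0) = 0$ and $p(1) = 1$, $\sigma = \exp$, and a two-valued log-permeability with a single interface at $a$. In that setting the flux $\sigma(u^a) p'$ is constant, so $p'$ is available in closed form, and the $V$-norm distance between two solutions is the $L^2$ norm of the difference of gradients. All function names are hypothetical.

```python
import numpy as np

def pressure_gradient(a, u1, u2, x):
    """p'(x) for -(kappa p')' = 0 on (0, 1), p(0) = 0, p(1) = 1, kappa = exp(u^a)."""
    kappa = np.where(x < a, np.exp(u1), np.exp(u2))
    # The constant flux is fixed by requiring that p' integrates to one.
    flux = 1.0 / (a * np.exp(-u1) + (1.0 - a) * np.exp(-u2))
    return flux / kappa

def v_norm_distance(a, b, u1=0.0, u2=2.0, n=200_000):
    """Discrete L^2 norm of p'_{u,a} - p'_{u,b}, evaluated at cell midpoints."""
    x = (np.arange(n) + 0.5) / n
    d = pressure_gradient(a, u1, u2, x) - pressure_gradient(b, u1, u2, x)
    return float(np.sqrt(np.mean(d**2)))

# Distances shrink as the interface b approaches a = 0.5.
distances = [v_norm_distance(0.5, 0.5 + h) for h in (0.1, 0.01, 0.001)]
```

In this toy setting the distance decays roughly like the square root of $|a - b|$, reflecting the fact that the gradients differ by an order-one amount on the thin set between the two interfaces; this is the set whose measure lemma A.1 is applied to.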


Proof  of  proposition  2.7. 

(i) Theorem 2.3 in [17] tells us that the mapping from the conductivity to the weak solution
of (2.5) is Fréchet differentiable with respect to the supremum norm, and hence locally
Lipschitz. Note that the mapping from the solution to the boundary voltage
measurements, $(v, V) \mapsto V$, is smooth, and the assumptions on $\sigma$ imply that it is
Lipschitz. It hence suffices to show that the mapping $u \mapsto F(u, a)$ is Lipschitz for each
$a \in A$. Let $u, v \in X$ and $a \in A$, then

$$ \|F(u, a) - F(v, a)\|_\infty \le \sum_{i=1}^N \|u_i - v_i\|_\infty \le C \|u - v\|_X $$

and the result follows.

(ii) By corollary 2.8 in [11] and the continuity of $\sigma$, it suffices to show that $a_n \to a$ in $A$
implies that $F(u, a_n) \to F(u, a)$ in measure. For any $p \in (1, \infty)$ we have that

$$ \int_D |F(u, a_n) - F(u, a)|^p \,\mathrm{d}x \le \sum_{i=1}^N \int_D |u_i|^p \mathbf{1}_{A_i(a_n) \Delta A_i(a)} \,\mathrm{d}x \le \sum_{i=1}^N \|u_i\|_\infty^p \cdot |A_i(a_n) \Delta A_i(a)|. $$

From the assumed continuity of $A_i(\cdot)$ it follows that $F(u, a_n) \to F(u, a)$ in $L^p$ for any
$p \in (1, \infty)$, and hence in measure. □
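
The distinction between convergence in $L^p$ and in $L^\infty$ in part (ii) is easy to see numerically. The sketch below assumes a toy construction map on $D = (0, 1)$ with two constant fields and one interface; the names and values are illustrative only. As $a_n \to a$ the $L^2$ distance between the constructed fields tends to zero, while the supremum distance stays equal to the jump across the interface.

```python
import numpy as np

def construct_field(u1, u2, a, x):
    """Toy construction map F(u, a): value u1 left of the interface a, u2 right of it."""
    return np.where(x < a, u1, u2)

x = (np.arange(100_000) + 0.5) / 100_000     # cell midpoints on (0, 1)
a, u1, u2 = 0.5, 0.0, 1.0
hs = 10.0 ** -np.arange(1, 4)                # interface perturbations 0.1, 0.01, 0.001

reference = construct_field(u1, u2, a, x)
l2 = np.array([np.sqrt(np.mean((construct_field(u1, u2, a + h, x) - reference) ** 2))
               for h in hs])
sup = np.array([np.max(np.abs(construct_field(u1, u2, a + h, x) - reference))
                for h in hs])
```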


A.2. Results from section 4

Proof  of  theorem  4.2. 

(i) We first claim that the assumptions on $\Phi$ mean that $\Phi(\cdot, \cdot; y) : X' \times A' \to \mathbb{R}$ is
continuous for each $y \in Y$. Fix $y \in Y$ and $(u, a) \in X' \times A'$. Choose any approximating
sequence $(u_n, a_n)_{n \ge 1} \subseteq X' \times A'$ such that $(u_n, a_n) \to (u, a)$. Then the assumptions on
the norm on $X \times A$ mean that $\|u_n - u\|_X \to 0$ and $|a_n - a| \to 0$. Letting
$r > \max\{\|u\|_X, \sup_n \|u_n\|_X\}$, we may approximate

$$ |\Phi(u_n, a_n; y) - \Phi(u, a; y)| \le |\Phi(u_n, a_n; y) - \Phi(u, a_n; y)| + |\Phi(u, a_n; y) - \Phi(u, a; y)| $$
$$ \le M_3(r, a_n, y) \|u_n - u\|_X + |\Phi(u, a_n; y) - \Phi(u, a; y)| $$
$$ \le \Big( \sup_{k \in \mathbb{N}} M_3(r, a_k, y) \Big) \cdot \|u_n - u\|_X + |\Phi(u, a_n; y) - \Phi(u, a; y)|, $$

where the supremum is finite due to the continuity of $M_3$ in its second component. Since $\Phi$
is also continuous in its second component, we see that the right-hand side tends to zero
as $(u_n, a_n) \to (u, a)$.

Now as $\Phi(\cdot, \cdot; y) : X' \times A' \to \mathbb{R}$ is continuous and $(\mu_0 \times \nu_0)(X' \times A') = 1$,
$\Phi(\cdot, \cdot; y)$ is $\mu_0 \times \nu_0$-measurable. Setting $Z = X' \times A'$, we can consider
$\Phi : Z \times Y \to \mathbb{R}$. This is a Carathéodory function, and it is known that these are jointly
measurable, see for example [1]. We conclude that $\Phi$ is $\mu_0 \times \nu_0 \times \mathbb{Q}_0$ measurable.

(ii) We first show $Z(y)$ is finite. Since $\mu_0$ is Gaussian, by Fernique's theorem there exists
$\alpha > 0$ such that

$$ \int_X \exp(\alpha \|u\|_X^2)\, \mu_0(\mathrm{d}u) < \infty. $$

Then using assumptions 4.1(i), we have the lower bound

$$ \Phi(u, a; y) \ge M_1(\alpha) - \alpha \|u\|_X^2 $$

from which we conclude that $Z(y) < \infty$.

Now fix $r > 0$. Let $y \in Y$ and take $(u, a) \in X' \times A'$ with $\max\{\|u\|_X, |a|\} < r$. Then
we have by the local Lipschitz property

$$ |\Phi(u, a; y)| \le M_3(r, a, y) \|u\|_X + |\Phi(0, a; y)| \le M_3(r, a, y) r + |\Phi(0, a; y)|. $$

Using the continuity of $\Phi$ and $M_3$ in $a$, we can maximise the right-hand side over
$|a| \le r$ to deduce that

$$ |\Phi(u, a; y)| \le K(r, y). $$

Thus $\Phi(\cdot, \cdot; y)$ is bounded on bounded sets.

Now we can proceed as in [10]. Using that $(\mu_0 \times \nu_0)(X' \times A') = 1$, we have that

$$ Z(y) = \int_{X' \times A'} \exp(-\Phi(u, a; y))\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a). $$

Set $B' = (X' \times A') \cap B$, and set

$$ R = \sup\{ \max\{\|u\|_X, |a|\} \mid (u, a) \in B' \}. $$

We deduce that

$$ \sup_{(u, a) \in B'} \Phi(u, a; y) \le K(R, y) < \infty $$

and so

$$ Z(y) \ge \int_{B'} \exp(-K(R, y))\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a) = \exp(-K(R, y)) (\mu_0 \times \nu_0)(B') > 0. $$

Hence the measure is well-defined.

(iii) The well-posedness of the posterior is proved in virtually the same way as theorem 4.5
in [10]. □


A.3. Results from section 5

Throughout this section, for $\delta > 0$ and $(u, a) \in X \times A$, we will denote
$J^\delta(u, a) = \mu(B^\delta(u, a))$. To prove theorems 5.1 and 5.2, we first require two lemmas.


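Lemma A.2. Let $(u_1, a_1), (u_2, a_2) \in E \times \mathrm{int}(S)$. Then

$$ \lim_{\delta \downarrow 0} \frac{(\mu_0 \times \nu_0)(B^\delta(u_1, a_1))}{(\mu_0 \times \nu_0)(B^\delta(u_2, a_2))} = e^{\frac12 \|u_2\|_E^2 - \frac12 \|u_1\|_E^2} \cdot \frac{\rho(a_1)}{\rho(a_2)} $$
$$ = \exp(J(u_2) + K(a_2) - J(u_1) - K(a_1)). $$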


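Proof. We adapt the proof of proposition 18.3 in [26] to first show that

$$ (\mu_0 \times \nu_0)(B^\delta(u_1, a_1)) \sim e^{-\frac12 \|u_1\|_E^2} (\mu_0 \times \nu_0)(B^\delta(0, a_1)) \quad \text{as } \delta \downarrow 0. $$

The first half of the proof is almost identical to that in [26], though some care must be taken
since we cannot (a priori) separate the integrals over balls in $X \times A$ into products of those
over balls in $X$ and $A$. Using the Cameron-Martin theorem we see that

$$ (\mu_0 \times \nu_0)(B^\delta(u_1, a_1)) = e^{-\frac12 \|u_1\|_E^2} \int_{B^\delta(0, a_1)} e^{\langle u_1, u \rangle_E}\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a). $$

Since $\langle u_1, -u \rangle_E = -\langle u_1, u \rangle_E$ and $B^\delta(0, a_1)$ is symmetric about $0 \in X$, it
follows that

$$ \int_{B^\delta(0, a_1)} e^{\langle u_1, u \rangle_E}\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a) = \int_{B^\delta(0, a_1)} \frac12 \big( e^{\langle u_1, u \rangle_E} + e^{-\langle u_1, u \rangle_E} \big)\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a) $$
$$ \ge (\mu_0 \times \nu_0)(B^\delta(0, a_1)) $$

which gives the inequality

$$ (\mu_0 \times \nu_0)(B^\delta(u_1, a_1)) \ge e^{-\frac12 \|u_1\|_E^2} (\mu_0 \times \nu_0)(B^\delta(0, a_1)). \qquad (8.1) $$

For the opposite bound, we write $\langle u_1, \cdot \rangle_E$ as the sum of two functionals $z_c$ and $z_s$
on $E$. We aim to choose $z_c$ to be continuous on $E$, and $z_s$ 'small' in some sense. Then we have that

$$ (\mu_0 \times \nu_0)(B^\delta(u_1, a_1)) = e^{-\frac12 \|u_1\|_E^2} \int_{B^\delta(0, a_1)} e^{z_c(u) + z_s(u)}\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a) $$
$$ \le \exp\Big( -\frac12 \|u_1\|_E^2 + \delta \cdot \sup_{(u, a) \in B^1(0, a_1)} z_c(u) \Big) \Big[ (\mu_0 \times \nu_0)(B^\delta(0, a_1)) + \int_{B^\delta(0, a_1)} (e^{z_s(u)} - 1)\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a) \Big] $$

where we have used the linearity of $z_c$ to extract $\delta$ from the supremum. As in [26], using a
result from [34], a special case of the Gaussian correlation conjecture, it follows that for any
$C \in \mathbb{R}$ and any convex set $B \subseteq X$ symmetric about 0,

$$ \mu_0(B \cap \{u \in X \mid |z_s(u)| > C\}) \le \mu_0(B)\, \mu_0(|z_s(\cdot)| > C). $$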
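Then for any increasing function $\varphi : \mathbb{R}^+ \to \mathbb{R}^+$, one has

$$ \int_{B^\delta(0, a_1)} \varphi(|z_s(u)|)\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a) = \int_{X \times A} \varphi(|z_s(u)|) \mathbf{1}_{B^\delta(0, a_1)}(u, a)\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a) $$
$$ = \int_0^\infty (\mu_0 \times \nu_0)(\{(u, a) \in B^\delta(0, a_1) \mid \varphi(|z_s(u)|) > t\})\, \mathrm{d}t $$
$$ = \int_0^\infty (\mu_0 \times \nu_0)(\{(u, a) \in B^\delta(0, a_1) \mid |z_s(u)| > \varphi^{-1}(t)\})\, \mathrm{d}t $$
$$ = \int_0^\infty \int_A \mu_0(\{u \in X \mid (u, a) \in B^\delta(0, a_1),\ |z_s(u)| > \varphi^{-1}(t)\})\, \nu_0(\mathrm{d}a)\, \mathrm{d}t $$
$$ \le \int_0^\infty \int_A \mu_0(\{u \in X \mid (u, a) \in B^\delta(0, a_1)\})\, \mu_0(|z_s(\cdot)| > \varphi^{-1}(t))\, \nu_0(\mathrm{d}a)\, \mathrm{d}t $$
$$ = \int_0^\infty \mu_0(|z_s(\cdot)| > \varphi^{-1}(t)) \underbrace{\int_A \mu_0(\{u \in X \mid (u, a) \in B^\delta(0, a_1)\})\, \nu_0(\mathrm{d}a)}_{(\mu_0 \times \nu_0)(B^\delta(0, a_1))}\, \mathrm{d}t $$
$$ = (\mu_0 \times \nu_0)(B^\delta(0, a_1)) \int_0^\infty \mu_0(|z_s(\cdot)| > \varphi^{-1}(t))\, \mathrm{d}t $$
$$ = (\mu_0 \times \nu_0)(B^\delta(0, a_1)) \int_0^\infty \mu_0(\varphi(|z_s(\cdot)|) > t)\, \mathrm{d}t $$
$$ = (\mu_0 \times \nu_0)(B^\delta(0, a_1)) \int_X \varphi(|z_s(u)|)\, \mu_0(\mathrm{d}u). $$

Choosing $\varphi(\cdot) = \exp(\cdot) - 1$ in this formula gives

$$ \int_{B^\delta(0, a_1)} (e^{|z_s(u)|} - 1)\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a) \le (\mu_0 \times \nu_0)(B^\delta(0, a_1)) \int_X (e^{|z_s(u)|} - 1)\, \mu_0(\mathrm{d}u). $$

The space of linear measurable functionals on $E$, which contains $\langle u_1, \cdot \rangle_E$, is the $L^2$
closure of $E^*$. Thus for any $\varepsilon > 0$, the functionals $z_c, z_s$ can be chosen in order that the
first of them is continuous and the second of them satisfies the inequality

$$ \int_X (e^{|z_s(u)|} - 1)\, \mu_0(\mathrm{d}u) \le \varepsilon. $$

It follows that for each $\varepsilon > 0$ we have

$$ (\mu_0 \times \nu_0)(B^\delta(u_1, a_1)) \le \exp\Big( -\frac12 \|u_1\|_E^2 + \delta \cdot \sup_{(u, a) \in B^1(0, a_1)} z_c(u) \Big) (\mu_0 \times \nu_0)(B^\delta(0, a_1)) (1 + \varepsilon). \qquad (8.2) $$

Since balls are bounded, $\varepsilon > 0$ is arbitrary and $z_c$ is continuous, we can combine (8.1) and
(8.2) to deduce that there exists $M > 0$ such that

$$ e^{-\frac12 \|u_1\|_E^2} (\mu_0 \times \nu_0)(B^\delta(0, a_1)) \le (\mu_0 \times \nu_0)(B^\delta(u_1, a_1)) \le e^{-\frac12 \|u_1\|_E^2 + \delta M} (\mu_0 \times \nu_0)(B^\delta(0, a_1)). $$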

Now looking at the ratio of measures we see

$$ \lim_{\delta \downarrow 0} \frac{(\mu_0 \times \nu_0)(B^\delta(u_1, a_1))}{(\mu_0 \times \nu_0)(B^\delta(u_2, a_2))} = e^{\frac12 \|u_2\|_E^2 - \frac12 \|u_1\|_E^2} \cdot \lim_{\delta \downarrow 0} \frac{(\mu_0 \times \nu_0)(B^\delta(0, a_1))}{(\mu_0 \times \nu_0)(B^\delta(0, a_2))}. $$

We now deal with the geometric parameters. Let $a^* \in \mathrm{int}(S)$ so that $\rho$ is positive in a
neighbourhood of $a^*$ (we may take $a^* = a_1$ or $a_2$ since we assume they lie in $\mathrm{int}(S)$). Then




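$$ \frac{(\mu_0 \times \nu_0)(B^\delta(0, a_1))}{(\mu_0 \times \nu_0)(B^\delta(0, a_2))} = \frac{\int_{B^\delta(0, a_1)} \rho(a)\, \mu_0(\mathrm{d}u)\, \mathrm{d}a}{\int_{B^\delta(0, a_2)} \rho(a)\, \mu_0(\mathrm{d}u)\, \mathrm{d}a} $$
$$ = \frac{\int_{B^\delta(0, a^*)} \rho(a + a_1 - a^*)\, \mu_0(\mathrm{d}u)\, \mathrm{d}a}{\int_{B^\delta(0, a^*)} \rho(a + a_2 - a^*)\, \mu_0(\mathrm{d}u)\, \mathrm{d}a} $$
$$ = \frac{\int_{B^\delta(0, a^*)} \frac{\rho(a + a_1 - a^*)}{\rho(a)}\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a)}{\int_{B^\delta(0, a^*)} \frac{\rho(a + a_2 - a^*)}{\rho(a)}\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a)}. $$

For sufficiently small $\delta$ both of the integrands are continuous. A mean-value property hence
holds for the integrals, and so we may divide both the numerator and denominator by
$(\mu_0 \times \nu_0)(B^\delta(0, a^*))$ and take limits to obtain

$$ \lim_{\delta \downarrow 0} \frac{(\mu_0 \times \nu_0)(B^\delta(0, a_1))}{(\mu_0 \times \nu_0)(B^\delta(0, a_2))} = \frac{\left. \frac{\rho(a + a_1 - a^*)}{\rho(a)} \right|_{a = a^*}}{\left. \frac{\rho(a + a_2 - a^*)}{\rho(a)} \right|_{a = a^*}} = \frac{\rho(a_1)}{\rho(a_2)}. $$

We conclude that

$$ \lim_{\delta \downarrow 0} \frac{(\mu_0 \times \nu_0)(B^\delta(u_1, a_1))}{(\mu_0 \times \nu_0)(B^\delta(u_2, a_2))} = e^{\frac12 \|u_2\|_E^2 - \frac12 \|u_1\|_E^2} \cdot \frac{\rho(a_1)}{\rho(a_2)} $$
$$ = \exp(J(u_2) + K(a_2) - J(u_1) - K(a_1)). $$

□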
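
The small-ball limit of lemma A.2 can be checked directly in a finite-dimensional analogue. The sketch below makes several assumptions that are not from the paper: $X = \mathbb{R}$ with $\mu_0 = N(0, 1)$ (so that $E = X$ and $\|u\|_E = |u|$), $S = [0, 1]$ with prior density $\rho(a) = 2(1 + a)/3$, and balls taken in the maximum norm so that they factorise as products. All function names are hypothetical.

```python
import math

def gaussian_ball(u, delta):
    """mu_0 measure of the interval (u - delta, u + delta) for mu_0 = N(0, 1)."""
    cdf = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return cdf(u + delta) - cdf(u - delta)

def rho(a):
    """Assumed prior density of the geometric parameter on S = [0, 1]."""
    return 2.0 * (1.0 + a) / 3.0

def nu_ball(a, delta):
    """nu_0 measure of (a - delta, a + delta) intersected with S, integrated exactly."""
    lo, hi = max(a - delta, 0.0), min(a + delta, 1.0)
    primitive = lambda t: 2.0 * (t + 0.5 * t * t) / 3.0
    return primitive(hi) - primitive(lo)

def ball_ratio(p1, p2, delta):
    """Ratio of prior measures of max-norm balls of radius delta centred at p1 and p2."""
    (u1, a1), (u2, a2) = p1, p2
    return (gaussian_ball(u1, delta) * nu_ball(a1, delta)) / (
        gaussian_ball(u2, delta) * nu_ball(a2, delta))

def limiting_ratio(p1, p2):
    """Right-hand side of the lemma: exp(|u2|^2/2 - |u1|^2/2) * rho(a1) / rho(a2)."""
    (u1, a1), (u2, a2) = p1, p2
    return math.exp(0.5 * u2**2 - 0.5 * u1**2) * rho(a1) / rho(a2)

p1, p2 = (0.5, 0.3), (1.5, 0.7)
errors = [abs(ball_ratio(p1, p2, d) - limiting_ratio(p1, p2)) for d in (0.1, 0.001)]
```

In this one-dimensional setting the discrepancy is of order $\delta^2$, so the second entry of `errors` is far smaller than the first.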


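Lemma A.3. Let $f, g : A \to \mathbb{R}$ be continuous, and $(u_1, a_1), (u_2, a_2) \in E \times \mathrm{int}(S)$. Then

$$ \lim_{\delta \downarrow 0} \frac{\int_{B^\delta(u_1, a_1)} f(a)\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a)}{\int_{B^\delta(u_2, a_2)} g(a)\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a)} = e^{\frac12 \|u_2\|_E^2 - \frac12 \|u_1\|_E^2} \cdot \frac{\rho(a_1)}{\rho(a_2)} \cdot \frac{f(a_1)}{g(a_2)}. $$
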

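Proof. Let $\varepsilon > 0$. Then by the continuity of $f$ and $g$, and the assumption on the norm on
$X \times A$, there exists $\delta > 0$ such that

$$ \frac{(f(a_1) - \varepsilon)(\mu_0 \times \nu_0)(B^\delta(u_1, a_1))}{(g(a_2) + \varepsilon)(\mu_0 \times \nu_0)(B^\delta(u_2, a_2))} \le \frac{\int_{B^\delta(u_1, a_1)} f(a)\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a)}{\int_{B^\delta(u_2, a_2)} g(a)\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a)} \le \frac{(f(a_1) + \varepsilon)(\mu_0 \times \nu_0)(B^\delta(u_1, a_1))}{(g(a_2) - \varepsilon)(\mu_0 \times \nu_0)(B^\delta(u_2, a_2))}. $$

The result now follows by the previous lemma. □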
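
Proof of theorem 5.1. Let $(u_1, a_1), (u_2, a_2) \in E \times \mathrm{int}(S)$. The case $\Phi = 0$ is the
result of lemma A.2. Now proceeding analogously to [9],

$$ \frac{J^\delta(u_1, a_1)}{J^\delta(u_2, a_2)} = \frac{\int_{B^\delta(u_1, a_1)} \exp(-\Phi(u, a))\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a)}{\int_{B^\delta(u_2, a_2)} \exp(-\Phi(u, a))\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a)} $$
$$ = \frac{\int_{B^\delta(u_1, a_1)} \exp(-\Phi(u, a) + \Phi(u_1, a_1)) \exp(-\Phi(u_1, a_1))\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a)}{\int_{B^\delta(u_2, a_2)} \exp(-\Phi(u, a) + \Phi(u_2, a_2)) \exp(-\Phi(u_2, a_2))\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a)}. $$

Using assumptions 4.1(iv), we have that for any $(u, a), (v, b) \in X \times A$,

$$ |\Phi(u, a) - \Phi(v, b)| \le M_3(r, a) \|u - v\|_X + |\Phi(v, a) - \Phi(v, b)| $$

where $r > \max\{\|u\|_X, \|v\|_X\}$. Now set

$$ L_1 = \max_{|a| \le |a_1| + \delta} M_3(\|u_1\|_X + \delta, a), \qquad L_2 = \max_{|a| \le |a_2| + \delta} M_3(\|u_2\|_X + \delta, a), $$

which are finite due to the continuity assumption on $M_3$. Then

$$ \frac{J^\delta(u_1, a_1)}{J^\delta(u_2, a_2)} \le e^{\delta(L_1 + L_2)} e^{-\Phi(u_1, a_1) + \Phi(u_2, a_2)} \times \frac{\int_{B^\delta(u_1, a_1)} \exp(|\Phi(u_1, a) - \Phi(u_1, a_1)|)\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a)}{\int_{B^\delta(u_2, a_2)} \exp(-|\Phi(u_2, a) - \Phi(u_2, a_2)|)\, \mu_0(\mathrm{d}u) \nu_0(\mathrm{d}a)}. $$

Note that both integrands are continuous in $a$, and so we may use the previous lemma. Taking
$\limsup_{\delta \downarrow 0}$ of both sides gives

$$ \limsup_{\delta \downarrow 0} \frac{J^\delta(u_1, a_1)}{J^\delta(u_2, a_2)} \le e^{-I(u_1, a_1) + I(u_2, a_2)}. $$

A similar method gives that the $\liminf_{\delta \downarrow 0}$ is bounded below by the RHS and so we have
that for any $(u_1, a_1), (u_2, a_2) \in E \times \mathrm{int}(S)$,

$$ \lim_{\delta \downarrow 0} \frac{J^\delta(u_1, a_1)}{J^\delta(u_2, a_2)} = e^{I(u_2, a_2) - I(u_1, a_1)}. $$

Noting that $I$ is continuous on $E \times S$, we see that $I$ agrees with the Onsager-Machlup
functional on $E \times S$. Finally note that $I(u, a) = \infty$ on $(X \setminus E) \times A$ and
$E \times (A \setminus S)$. □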


Remark A.4. Note that the limit above is independent of the choice of norm used on the
product space $X \times A$ when referring to the balls. If we use the norm given by

$$ \|(x, a)\| = \max\{\|x\|_X, |a|\} $$

then we have that

$$ B^\delta(u, a) = B^\delta(u) \times B^\delta(a) $$

and so may deduce that, for any choice of norm on $X \times A$,

$$ \lim_{\delta \downarrow 0} \frac{(\mu_0 \times \nu_0)(B^\delta(u_1, a_1))}{(\mu_0 \times \nu_0)(B^\delta(u_2, a_2))} = \lim_{\delta \downarrow 0} \frac{(\mu_0 \times \nu_0)(B^\delta(u_1) \times B^\delta(a_1))}{(\mu_0 \times \nu_0)(B^\delta(u_2) \times B^\delta(a_2))} $$
$$ = \lim_{\delta \downarrow 0} \frac{\mu_0(B^\delta(u_1))}{\mu_0(B^\delta(u_2))} \cdot \frac{\nu_0(B^\delta(a_1))}{\nu_0(B^\delta(a_2))}. $$

This will be useful later for separating integrals.

Proof of theorem 5.2. We follow the idea of the proof of theorem 5.4 in [33], which is
based on [7, 19], and first show $I = \Phi + J + K$ is weakly lower semicontinuous on $E \times S$.
Let $(u_n, a_n) \rightharpoonup (u, a)$ in $E \times S$. Since $S \subseteq \mathbb{R}^k$, weak convergence of the second
component is equivalent to strong convergence. Since $\mu_0(X) = 1$, $E$ is compactly embedded in $X$
and so $u_n \to u$ strongly in $X$. In the proof of existence of the posterior distribution we showed
that $\Phi$ is continuous on $X \times A$, and so we deduce that $\Phi(u_n, a_n) \to \Phi(u, a)$. Hence $\Phi$
is weakly continuous on $E \times S$. The functional $J$ is weakly lower semicontinuous on $E$ and $K$ is
continuous on $S$, and so $I$ is weakly lower semicontinuous on $E \times S$.

Now we show $I$ is coercive on $E \times S$. Since $E$ is compactly embedded in $X$ there exists a
$C > 0$ such that

$$ \|u\|_X^2 \le C \|u\|_E^2. $$

Therefore by assumption 4.1(i) it follows that, for any $\varepsilon > 0$, there is an
$M(\varepsilon) \in \mathbb{R}$ such that

$$ I(u, a) \ge M(\varepsilon) + \Big( \frac12 - C\varepsilon \Big) \|u\|_E^2 + K(a). $$

Since $K$ is bounded below$^4$ by $-\log \|\rho\|_\infty$, we may incorporate this into the constant
term $M(\varepsilon)$:

$$ I(u, a) \ge M(\varepsilon) + \Big( \frac12 - C\varepsilon \Big) \|u\|_E^2. $$

By choosing $\varepsilon = 1/4C$, we see that there is an $M \in \mathbb{R}$ such that, for all
$(u, a) \in E \times S$,

$$ I(u, a) \ge \frac14 \|u\|_E^2 + M $$

which establishes coercivity.

Now take a minimising sequence $(u_n, a_n)$ such that, writing $\bar{I}$ for the infimum of $I$ over
$E \times S$, for any $\delta > 0$ there exists an $N_1 = N_1(\delta)$ such that

$$ M \le \bar{I} \le I(u_n, a_n) \le \bar{I} + \delta, \quad \forall n \ge N_1. $$

From the coercivity it can be seen that the sequence $(u_n, a_n)$ is bounded in $E \times S$. Since
$E \times S$ is a closed subset of a Hilbert space, there exists $(u, a) \in E \times S$ such that (possibly
along a subsequence) $(u_n, a_n) \rightharpoonup (u, a)$ in $E \times S$. From the weak lower semicontinuity
of $I$ it follows that, for any $\delta > 0$,

$$ \bar{I} \le I(u, a) \le \bar{I} + \delta. $$

Since $\delta$ is arbitrary the first result follows.

Now consider the subsequence $(u_n, a_n) \rightharpoonup (u, a)$. The convergence of $a_n \to a$ is strong,
so all that needs to be checked is that $u_n \to u$ strongly in $X$. This follows from exactly the
same argument as in the proof of theorem 5.4 in [33] (taking $a$ as the second parameter in $I$
and $\Phi$) and so the second result follows. □

$^4$ Recall in section 3.3 we assumed $\rho$ to be continuous on the compact set $S$, and hence bounded.

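Before proving theorem 5.3 we first collect some results on centred Gaussian measures
from [9], specifically lemmas 3.6, 3.7, and 3.9. For $u \in X$, let

$$ J_0^\delta(u) = \mu_0(B^\delta(u)). $$

Proposition A.5.

(i) Let $\delta > 0$ and $u \in X$. Then we have

$$ \frac{J_0^\delta(u)}{J_0^\delta(0)} \le c\, e^{-\frac{a_1}{2} (\|u\|_X - \delta)^2}, $$

where $c = \exp\big( \frac{a_1}{2} \delta^2 \big)$ and $a_1$ is a constant independent of $u$ and $\delta$.

(ii) Suppose that $u \notin E$, $(u^\delta)_{\delta > 0} \subset X$ and $u^\delta$ converges weakly to $u \in X$ as
$\delta \downarrow 0$. Then for any $\varepsilon > 0$ there exists $\delta$ small enough such that

$$ \frac{J_0^\delta(u^\delta)}{J_0^\delta(0)} < \varepsilon. $$

(iii) Consider $(u^\delta)_{\delta > 0} \subset X$ and suppose that $u^\delta$ converges weakly and not strongly
to 0 in $X$ as $\delta \downarrow 0$. Then for any $\varepsilon > 0$ there exists $\delta$ small enough such that

$$ \frac{J_0^\delta(u^\delta)}{J_0^\delta(0)} < \varepsilon. $$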

Proof  of  theorem  5.3. 

(i) We first show $(u^\delta, a^\delta)$ is bounded in $X \times A$. The boundedness of the second
component is clear since $S$ is bounded, so it suffices to show that $(u^\delta)$ is bounded in $X$.
This is proved in the same way as in theorem 3.5 in [9].

In  the  proof  of  existence  of  the  posterior  measure,  theorem  4.2,  we  show  that  if  r  >  0  and 
\\u\\x,  \a\  <  r,  then  there  exists  K(r)  >  0  such  that  <T(w,  a)  <  K{r).  Letting 
c  =  >  0,  it  follows  in  the  same  was  as  [9]  that,  given  any  a  E  S,  for  8  <  1 

we  have 

Js0(us,  a)  ^  cJq(0,  a). 

Suppose  that  ( u 8)  is  not  bounded  in  X  so  that  for  any  R  >  0  there  exists  8R  such  that 
\\u?*\\x  >  R,  with  8r  —>  0  as  R  — >  cxo.  Then  the  above  bound  says  that 

Joiu?,  a)  =  ^0(g^(M6))  _  I/Q (Bs(a))  > 

JSo( 0,  a )  V0(B6m  '  MB6(a))  " 

This  contradicts  proposition  A.5(i)  above.  Therefore  there  exists  R,  8R  >  0  such  that 
II (u8,  a8) ||xxA  <  R  for  any  <5  <  8R. 

Hence there exist $(\bar{u}, \bar{a}) \in X \times A$ and a subsequence of
$(u^\delta, a^\delta)_{0 < \delta < \delta_R}$ which converges weakly in $X \times A$ to $(\bar{u}, \bar{a})$ as
$\delta \downarrow 0$. For simplicity of notation we still call this subsequence $(u^\delta, a^\delta)$.

We now show that $(u^\delta, a^\delta)$ converges strongly to an element of $E \times S$. We first show that
$(\bar{u}, \bar{a}) \in X \times S$.

Note that any limit point of $a^\delta$ must lie in $S$. Suppose it did not, and a limit point was
$a^* \notin S$. Then there exists $\delta^\dagger > 0$ such that along a subsequence converging to $a^*$,
$\delta < \delta^\dagger$ implies $a^\delta \notin S$ since $S$ is closed. For
$\delta < \frac12 \mathrm{dist}(a^*, S) \wedge \delta^\dagger$ we then have $B^\delta(a^\delta) \cap S = \emptyset$. In
particular $\nu_0(B^\delta(a^\delta)) = 0$ for all such $\delta$, which in turn implies
$J^\delta(u, a^\delta) = 0$ for any $u \in X$, contradicting the definition of $a^\delta$. It follows that we
must have $\bar{a} \in S$.

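We need to show $\bar u \in E$. From the definition of $(u^\delta, a^\delta)$ and the bounds on $\Phi$ we have, for $\delta$ small enough and some$^5$ $\alpha$ close to 1,

$$1 \le \frac{J^\delta(u^\delta, a^\delta)}{J^\delta(0, a^\delta)} \le \alpha\, e^{K(1)-M}\, \frac{\int_{B^\delta(u^\delta)} \mu_0(du) \int_{B^\delta(a^\delta)} \nu_0(da)}{\int_{B^\delta(0)} \mu_0(du) \int_{B^\delta(a^\delta)} \nu_0(da)} = \alpha\, e^{K(1)-M}\, \frac{\int_{B^\delta(u^\delta)} \mu_0(du)}{\int_{B^\delta(0)} \mu_0(du)}.$$

We use proposition A.5(ii). Supposing $\bar u \notin E$, for any $\varepsilon > 0$ there exists $\delta$ small enough such that

$$\frac{\int_{B^\delta(u^\delta)} \mu_0(du)}{\int_{B^\delta(0)} \mu_0(du)} < \varepsilon.$$

We may choose $\varepsilon = \frac{1}{2\alpha} e^{M-K(1)}$ to deduce that there exists $\delta$ small enough such that

$$1 \le \frac{J^\delta(u^\delta, a^\delta)}{J^\delta(0, a^\delta)} \le \frac{1}{2},$$

which is a contradiction, and so $\bar u \in E$.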

Knowing that $(\bar u, \bar a) \in E \times S$ we now show that the convergence is strong. Since $A$ is finite-dimensional, any convergence of the second component is strong, and so we just need to show that $u^\delta \to \bar u$ strongly in $X$. Suppose the convergence is not strong; then we may use proposition A.5(iii) on the sequence $u^\delta - \bar u$. The same choice of $\varepsilon$ as above leads to the same contradiction, and so we deduce that $u^\delta \to \bar u$ strongly in $X$ and the first result is proved.

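(ii) We now show that $(\bar u, \bar a)$ is a MAP estimator and minimises $I$. As in [9], and the proof of theorem 5.1, we can use assumptions 4.1(iii) to see that

$$
\begin{aligned}
\frac{J^\delta(u^\delta, a^\delta)}{J^\delta(\bar u, \bar a)}
&\le e^{-\Phi(u^\delta, a^\delta) + \Phi(\bar u, \bar a)} \times \frac{\int_{B^\delta(u^\delta, a^\delta)} \exp\big(|\Phi(u, a) - \Phi(u^\delta, a^\delta)|\big)\, \mu_0(du)\,\nu_0(da)}{\int_{B^\delta(\bar u, \bar a)} \exp\big(-|\Phi(u, a) - \Phi(\bar u, \bar a)|\big)\, \mu_0(du)\,\nu_0(da)} \\
&\le e^{\delta(L_1 + L_2) - \Phi(u^\delta, a^\delta) + \Phi(\bar u, \bar a)} \times \frac{\int_{B^\delta(u^\delta, a^\delta)} \mu_0(du)\,\nu_0(da)}{\int_{B^\delta(\bar u, \bar a)} \mu_0(du)\,\nu_0(da)},
\end{aligned}
$$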


$^5$ Remark A.4 tells us that we can separate the integrals in the limit $\delta \downarrow 0$.

where

$$L_1 = \max_{|a| \le |a^\delta| + \delta} M_3(\|u^\delta\|_X + \delta, a), \qquad L_2 = \max_{|a| \le |\bar a| + \delta} M_3(\|\bar u\|_X + \delta, a).$$


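Therefore, using the continuity of $\Phi$, as shown in the proof of existence of the posterior distribution, and that $(u^\delta, a^\delta) \to (\bar u, \bar a)$ strongly in $X \times A$,

$$\limsup_{\delta \downarrow 0} \frac{J^\delta(u^\delta, a^\delta)}{J^\delta(\bar u, \bar a)} \le \limsup_{\delta \downarrow 0} \frac{\int_{B^\delta(u^\delta, a^\delta)} \mu_0(du)\,\nu_0(da)}{\int_{B^\delta(\bar u, \bar a)} \mu_0(du)\,\nu_0(da)}.$$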

Suppose $u^\delta$ is not bounded in $E$, or if it is, it only converges weakly (and not strongly) in $E$. Then $\|\bar u\|_E < \liminf_{\delta \to 0} \|u^\delta\|_E$ and hence, for small enough $\delta$, $\|\bar u\|_E < \|u^\delta\|_E$.
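The strict inequality above reflects the fact that a norm can drop in a weak limit. The following minimal sketch illustrates this in a finite truncation of $\ell^2$, used here only as a stand-in for the Hilbert space $E$; the truncation dimension `N`, the limit `u_bar`, the test vector `test_vec` and the sequence `u_seq` are all hypothetical choices made for illustration, not objects from the paper. A finite-dimensional truncation cannot exhibit genuine weak-but-not-strong convergence, so the code only mimics the behaviour for indices below `N`.

```python
import math

N = 500  # truncation dimension (illustrative assumption)

def inner(x, y):
    # Euclidean inner product on the truncated space.
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

# Hypothetical limit point and a fixed square-summable test vector.
u_bar = [1.0 / (k + 1) ** 2 for k in range(N)]
test_vec = [1.0 / (k + 1) for k in range(N)]

def u_seq(n):
    # u_n = u_bar + e_n: pairings with fixed vectors converge to those of
    # u_bar as n grows, but the norm distance to u_bar stays equal to 1.
    return [u + (1.0 if k == n else 0.0) for k, u in enumerate(u_bar)]

if __name__ == "__main__":
    for n in (10, 100, 400):
        pairing_gap = abs(inner(u_seq(n), test_vec) - inner(u_bar, test_vec))
        print(n, pairing_gap, norm(u_seq(n)), norm(u_bar))
```

The pairing gap decays like $1/(n+1)$ while `norm(u_seq(n))` stays strictly above `norm(u_bar)`, which is the finite-dimensional shadow of $\|\bar u\|_E < \liminf \|u^\delta\|_E$.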

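Therefore, since $\mu_0$ is centred and $\|u^\delta - \bar u\|_X \to 0$, $|a^\delta - \bar a| \to 0$,

$$
\begin{aligned}
\limsup_{\delta \downarrow 0} \frac{J^\delta(u^\delta, a^\delta)}{J^\delta(\bar u, \bar a)}
&\le \limsup_{\delta \downarrow 0} \frac{\int_{B^\delta(u^\delta, a^\delta)} \mu_0(du)\,\nu_0(da)}{\int_{B^\delta(\bar u, \bar a)} \mu_0(du)\,\nu_0(da)}
= \limsup_{\delta \downarrow 0} \frac{\int_{B^\delta(u^\delta)} \mu_0(du) \int_{B^\delta(a^\delta)} \nu_0(da)}{\int_{B^\delta(\bar u)} \mu_0(du) \int_{B^\delta(\bar a)} \nu_0(da)} \\
&\le \limsup_{\delta \downarrow 0} \frac{\int_{B^\delta(u^\delta)} \mu_0(du)}{\int_{B^\delta(\bar u)} \mu_0(du)} \cdot \limsup_{\delta \downarrow 0} \frac{\int_{B^\delta(a^\delta)} \nu_0(da)}{\int_{B^\delta(\bar a)} \nu_0(da)} \\
&\le \limsup_{\delta \downarrow 0} \frac{\int_{B^\delta(a^\delta)} \nu_0(da)}{\int_{B^\delta(\bar a)} \nu_0(da)} \\
&= \limsup_{\delta \downarrow 0} \frac{\dfrac{1}{|B^\delta(a^\delta)|} \displaystyle\int_{B^\delta(a^\delta)} \rho(a)\,da}{\dfrac{1}{|B^\delta(\bar a)|} \displaystyle\int_{B^\delta(\bar a)} \rho(a)\,da} \\
&= 1.
\end{aligned}
$$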


The final equality above follows from the continuity of the integrand and the fact that $|a^\delta - \bar a| \to 0$: both the numerator and the denominator tend to $\rho(\bar a)$.
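As a numerical sanity check of this last step, the sketch below averages a continuous density over shrinking balls in one dimension and compares the average around a moving centre $a^\delta$ with the average around the limit $\bar a$. The density `rho`, the interval $S = [0, 1]$, the centre `a_bar = 0.3` and the sequence $a^\delta = \bar a + \delta$ are assumptions chosen purely for illustration; they do not come from the paper.

```python
import math

def rho(a):
    # Hypothetical continuous, positive prior density on S = [0, 1].
    return 1.0 + 0.5 * math.sin(2.0 * math.pi * a)

def ball_average(center, delta, n=2000):
    # Midpoint-rule approximation of (1/|B|) * integral of rho over
    # the ball B = (center - delta, center + delta).
    h = 2.0 * delta / n
    total = sum(rho(center - delta + (i + 0.5) * h) for i in range(n))
    return total * h / (2.0 * delta)

def average_ratio(a_bar, delta):
    # Illustrative sequence a^delta = a_bar + delta, so |a^delta - a_bar| -> 0.
    a_delta = a_bar + delta
    return ball_average(a_delta, delta) / ball_average(a_bar, delta)

if __name__ == "__main__":
    for delta in (1e-1, 1e-2, 1e-3, 1e-4):
        print(delta, average_ratio(0.3, delta))
```

Both averages approach `rho(0.3)` as `delta` shrinks, so the printed ratio approaches 1, in line with the argument above.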


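By definition of $(u^\delta, a^\delta)$ we have $J^\delta(u^\delta, a^\delta) \ge J^\delta(\bar u, \bar a)$, and hence

$$\liminf_{\delta \downarrow 0} \frac{J^\delta(u^\delta, a^\delta)}{J^\delta(\bar u, \bar a)} \ge 1.$$

Together with the bound on the limit superior above, this implies that

$$\lim_{\delta \downarrow 0} \frac{J^\delta(u^\delta, a^\delta)}{J^\delta(\bar u, \bar a)} = 1. \qquad (8.3)$$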


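In the case where $(u^\delta)$ converges strongly to $\bar u$ in $E$, we see from the proof of lemma A.2 that we have

$$
\begin{aligned}
e^{\frac{1}{2}\|\bar u\|_E^2 - \frac{1}{2}\|u^\delta\|_E^2 - M\delta}\, \frac{(\mu_0 \times \nu_0)(B^\delta(0, a^\delta))}{(\mu_0 \times \nu_0)(B^\delta(0, \bar a))}
&\le \frac{(\mu_0 \times \nu_0)(B^\delta(u^\delta, a^\delta))}{(\mu_0 \times \nu_0)(B^\delta(\bar u, \bar a))} \\
&\le e^{\frac{1}{2}\|\bar u\|_E^2 - \frac{1}{2}\|u^\delta\|_E^2 + M\delta}\, \frac{(\mu_0 \times \nu_0)(B^\delta(0, a^\delta))}{(\mu_0 \times \nu_0)(B^\delta(0, \bar a))}.
\end{aligned}
$$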

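Since we have $u^\delta \to \bar u$ strongly in $E$, we have in particular that $\|u^\delta\|_E \to \|\bar u\|_E$. It follows that $e^{\frac{1}{2}\|\bar u\|_E^2 - \frac{1}{2}\|u^\delta\|_E^2 \pm M\delta} \to 1$ as $\delta \downarrow 0$. Now using the continuity of $\rho$ and the fact that $|a^\delta - \bar a| \to 0$, an argument similar to that in the proof of lemma A.2 shows that

$$\lim_{\delta \downarrow 0} \frac{(\mu_0 \times \nu_0)(B^\delta(0, a^\delta))}{(\mu_0 \times \nu_0)(B^\delta(0, \bar a))} = 1.$$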
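The exponential prefactor in the two-sided bound can be checked directly in the simplest setting. The sketch below takes $X = E = \mathbb{R}$ and $\mu_0 = N(0, 1)$, which is an assumption made for illustration only (the paper works with Gaussian measures on function spaces), and compares the small-ball ratio $\mu_0(B^\delta(u^\delta)) / \mu_0(B^\delta(\bar u))$ with $\exp(\tfrac{1}{2}|\bar u|^2 - \tfrac{1}{2}|u^\delta|^2)$. The helper names are hypothetical.

```python
import math

def std_normal_cdf(x):
    # CDF of N(0, 1), written with the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ball_mass(center, delta):
    # mu_0(B^delta(center)) for mu_0 = N(0, 1) on the real line.
    return std_normal_cdf(center + delta) - std_normal_cdf(center - delta)

def shifted_ratio(u_delta, u_bar, delta):
    # Ratio of small-ball masses centred at u_delta and at u_bar.
    return ball_mass(u_delta, delta) / ball_mass(u_bar, delta)

def exponential_factor(u_delta, u_bar):
    # In one dimension the Cameron-Martin norm is the absolute value.
    return math.exp(0.5 * u_bar ** 2 - 0.5 * u_delta ** 2)

if __name__ == "__main__":
    u_bar = 1.0
    for delta in (1e-1, 1e-2, 1e-3):
        u_delta = u_bar + delta  # illustrative strongly convergent sequence
        print(delta, shifted_ratio(u_delta, u_bar, delta),
              exponential_factor(u_delta, u_bar))
```

As `delta` decreases the two printed columns agree more closely and both tend to 1, matching the role of the prefactor when $u^\delta \to \bar u$ strongly in $E$.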

We therefore deduce that

$$\lim_{\delta \downarrow 0} \frac{\int_{B^\delta(u^\delta, a^\delta)} \mu_0(du)\,\nu_0(da)}{\int_{B^\delta(\bar u, \bar a)} \mu_0(du)\,\nu_0(da)} = 1$$

and (8.3) follows again. Therefore $(\bar u, \bar a)$ is a MAP estimator of the measure $\mu$.

The proof that $(\bar u, \bar a)$ minimises $I$ is identical to that in the proof of theorem 3.5 in [9]. □